
HUMAN NEURONAL ATTACHMENT FACTOR-1



This invention relates to newly identified polynucleotides, polypeptides 
encoded by such polynucleotides, the use of such polynucleotides and 
polypeptides, as well as the production of such polynucleotides and polypeptides. 
More particularly, the polypeptide of the present invention has been putatively 
identified as a human neuronal attachment factor-1, sometimes hereinafter referred
to as "NAF-1". The invention also relates to inhibiting the action of such 
polypeptides. 


Background of the Invention 

F-spondin (FSP) is a gene that is predominantly expressed during the early 
development of the vertebrate nervous system. The main function is thought to be 
in neural cell pattern formation and axonal growth. It was found in a subtractive 
hybridization screen designed to isolate floor-plate specific genes. The floor-plate 
provides diffusible signals that act on the neurons that extend from the developing 
spinal cord. These signals can lead to chemoattraction and fasciculation of 
commissural axons in the ventral midline. F-spondin mRNA is expressed at high 
levels in the developing neural tube at the ventral midline even before cell 
differentiation markers can detect the floor-plate. F-spondin is not detectable in
other regions of the spinal cord until later in embryonic life. There is also transient 
F-spondin expression early in peripheral nerve development which diminishes to 
undetectable levels following birth. The adult central nervous system contains 
F-spondin while the peripheral nerve (sciatic nerve) does not. Outside the adult 
nervous system, organs such as the lung and kidney also express F-spondin. The
gene encodes a protein of 807 amino acids with a predicted size of 90 kD. The
apparent size is approximately 116 kD by SDS-PAGE, which indicates
post-translational modifications such as glycosylation. There are six domains
homologous to the thrombospondin (TSP) type 1 repeats (TSR) which have been 
shown to control cell adhesion. The protein has been expressed in COS cells and 
purified as a myc-tag fusion protein. This protein was active in promoting neurite 
extension and adhesion of embryonic dorsal root ganglion and dorsal spinal cord
cells, respectively. It was not chemotropic for embryonic dorsal spinal cord
neurons (Klar, A. et al., Cell, 69:95-110 (1992)).
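
As a rough check on the sizes quoted above, the mass of an unmodified polypeptide can be estimated from its residue count. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a figure taken from this specification; the function name is illustrative only.

```python
# Assumed rule-of-thumb average residue mass; not a value from the specification.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kd(n_residues: int,
                     avg_residue_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Return the approximate mass, in kD, of an unmodified polypeptide."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * avg_residue_mass_da / 1000.0


predicted_kd = estimate_mass_kd(807)  # F-spondin: 807 amino acids (from the text)
apparent_kd = 116.0                   # apparent size by SDS-PAGE (from the text)
print(f"estimated {predicted_kd:.1f} kD, apparent {apparent_kd:.0f} kD, "
      f"difference {apparent_kd - predicted_kd:.1f} kD")
```

Under this assumption the estimate is close to the predicted 90 kD, and the gap to the apparent 116 kD is in line with the modifications such as glycosylation suggested in the text.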

The C-terminal half of F-spondin contains 6 repeats identified in
thrombospondin and other proteins implicated in cell adhesion. Thrombospondin is
a 450,000-dalton glycoprotein secreted by platelets in response to such
physiological activators as thrombin and collagen (Lawler, J., Blood, 67:1197-1209
(1986)). TSP comprises 3% of the total platelet protein and 25% of the total
platelet-secreted proteins (Tuszynski, G.P., et al., J. Biol. Chem., 260:12240-12245
(1985)). Although the precise biological role of TSP has yet to be fully
established, it is generally accepted that TSP plays a major role in cell adhesion and
cell-cell interactions. It should be pointed out that the C-terminal repeats present in
thrombospondin may have different biological activities.

TSP was found to promote the cell-substratum adhesion of a variety of
cells, including platelets, melanoma cells, smooth muscle cells, endothelial cells,
fibroblasts and epithelial cells (Tuszynski, G.P., et al., Science (Washington, DC),
236:1570-1573 (1983)).

Thrombospondin has been postulated to play a role in malarial infection
induced by only one species of malaria parasite, Plasmodium falciparum. During
malarial infection, TSP promotes adhesion of parasitized red cells to endothelial
cells (Roberts, D. D., et al., Nature (Lond.), 318:64-66 (1984)), and during tumor
cell metastases TSP promotes adhesion of mouse sarcoma cells to the vascular bed
and expression of the malignant phenotype of small cell carcinoma (Castle, V. J.,
J. Clin. Invest., 87:1883-1883 (1991)).

Properdin is a complement-binding protein which also contains the 6
terminal repeats found in thrombospondin. UNC-5, a C. elegans gene that bears
two terminal repeats, appears to guide the axonal extension of a subset of
neurons. These proteins, which contain at least one member of the six terminal
repeats, form a family of proteins which have related functions.

The gene and polypeptide encoded thereby of the present invention have been
putatively identified as a Neuronal Attachment Factor-1 protein as a result of amino
acid sequence homology to rat F-spondin.

Summary of the Invention 

In accordance with one aspect of the present invention, there is provided a
novel mature polypeptide, as well as biologically active and diagnostically or
therapeutically useful fragments, analogs and derivatives thereof. The polypeptide
of the present invention is of human origin.




In accordance with another aspect of the present invention, there are 
provided isolated nucleic acid molecules encoding a polypeptide of the present 
invention including mRNAs, cDNAs, genomic DNAs as well as analogs and 
biologically active and diagnostically or therapeutically useful fragments thereof. 

In accordance with another aspect of the present invention there is provided 
an isolated nucleic acid molecule encoding a mature polypeptide expressed by the 
human cDNA contained in ATCC Deposit No. 97343. 

In accordance with yet a further aspect of the present invention, there is 
provided a process for producing such polypeptide by recombinant techniques 
comprising culturing recombinant prokaryotic and/or eukaryotic host cells, 
containing a nucleic acid sequence encoding a polypeptide of the present invention, 
under conditions promoting expression of said protein and subsequent recovery of 
said protein. 

In accordance with yet a further aspect of the present invention, there is 
provided a process for utilizing such polypeptide, or polynucleotide encoding such 
polypeptide for therapeutic purposes, for example, to treat spinal cord injuries or 
damage to peripheral nerves by promoting neural cell adhesion and neurite 
extension, to inhibit tumor cell metastases, inhibit endothelial cell proliferation, 
adhesion and motility, to decrease tumor neovascularization, to be angiostatic for 
tumor cells and to promote wound healing. 

In accordance with yet a further aspect of the present invention, there are 
provided antibodies against such polypeptides, which would bind to and neutralize 
NAF-1 to inhibit its putative cell adhesion properties to restrict metastases, 
particularly tumor metastases. 

In accordance with another aspect of the present invention, there are
provided NAF-1 agonists which mimic NAF-1 and bind to the NAF-1 receptors.

In accordance with yet another aspect of the present invention, there are 
provided antagonists to such polypeptides, which may be used to inhibit the action 
of such polypeptides, for example, in the treatment of malarial infection caused by 
Plasmodium falciparum. 

In accordance with yet a further aspect of the present invention, there are also
provided nucleic acid probes comprising nucleic acid molecules of sufficient length 
to hybridize to a nucleic acid sequence of the present invention. 

In accordance with still another aspect of the present invention, there are 
provided diagnostic assays for detecting diseases or susceptibility to diseases related 
to mutations in the nucleic acid sequences encoding a polypeptide of the present 
invention. 

In accordance with yet a further aspect of the present invention, there is
provided a process for utilizing such polypeptides, or polynucleotides encoding 
such polypeptides, for in vitro purposes related to scientific research, for example, 
synthesis of DNA and manufacture of DNA vectors. 

These and other aspects of the present invention should be apparent to those 
skilled in the art from the teachings herein. 

Brief Description of the Figures 

The following drawings are illustrative of embodiments of the invention and 
are not meant to limit the scope of the invention as encompassed by the claims.

Figure 1 is an illustration of the cDNA and corresponding amino acid
sequence of the polypeptide of the present invention. Sequencing was performed
using a 373 automated DNA sequencer (Applied Biosystems, Inc.). The putative 
leader sequence region is underlined. 

Figure 2 is an amino acid sequence comparison between the polypeptide of
the present invention (bottom line) and rat F-spondin (rFSP) (top line).

Figure 3 is an amino acid sequence comparison between the cell adhesion
sequence of NAF-1 (FLP-TSR) and the six cell adhesion sequences of rat
F-spondin ([illegible] -4, -5 and -6; SEQ ID NOS:8-13, respectively).
Also shown is a TSR consensus sequence shown in the sequence listing as SEQ ID
NO: [illegible].

Figure 4 shows an analysis of the NAF-1 amino acid sequence.
Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic
regions; flexible regions; antigenic index and surface probability are shown. In the
"Antigenic Index - Jameson-Wolf" graph, the positive peaks indicate locations of
the highly antigenic regions of the NAF-1 protein, i.e., regions from which
epitope-bearing peptides of the invention can be obtained.

Detailed Description

In accordance with an aspect of the present invention, there is provided an
isolated nucleic acid (polynucleotide) which encodes for the mature polypeptide
having the deduced amino acid sequence of Figure 1 (SEQ ID NO:2).



The polynucleotide of this invention was discovered in a cDNA library 
derived from human epithelioid sarcoma. It is structurally related to the rat F- 
spondin family. It contains an open reading frame encoding a protein of 331 amino 
acid residues. The protein exhibits the highest degree of homology to rat F-spondin 
with 33.1% identity and 52.9% similarity over the entire amino acid stretch. The 
gene of the present invention shows the greatest homology at the nucleotide level to 
the rat F-spondin gene with 66% similarity and 66% identity. It is also important 
that the polypeptide of the present invention contains the conserved motif, WSXW, 
which is a potential binding sequence for polypeptides in this family. 
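
The conserved motif can be located in a protein sequence with a simple pattern search. The following is a minimal sketch in which X is taken to mean any single residue in one-letter code; the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
import re

# WSXW: tryptophan, serine, any one residue, tryptophan (one-letter codes).
# A lookahead is used so that overlapping occurrences are also reported.
WSXW = re.compile(r"(?=(WS[A-Z]W))")


def find_wsxw(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for each WSXW occurrence."""
    return [(m.start() + 1, m.group(1)) for m in WSXW.finditer(protein.upper())]


example = "MKTAWSPWSEWGHC"  # hypothetical sequence, not the NAF-1 sequence
print(find_wsxw(example))
```

Reporting 1-based positions matches the residue numbering convention used for SEQ ID NO:2 in this description.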

Northern blot analysis of the gene of the present invention showed a
broad band at 1.6-1.9 kb in liver and lower level expression in kidney, lung, heart
and placenta. Brain expression was barely detectable. Two libraries which were
constructed from tissues induced to undergo apoptosis, apoptotic T-cells (HTG) and
TNF-induced amniotic cells (HAU), had one clone in each. By extrapolation,
NAF-1 was represented at least 50 times more frequently among expressed
sequence tags from apoptotic T-cells than among those from all normal and
activated T-cell libraries. In the TNF-induced amniotic cell library, NAF-1 was
detected in 1 out of 2,414 expressed sequence tags versus 0 out of 3,595 expressed
sequence tags for the non-TNF-treated amniotic cell library.
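
The library comparison above can be expressed as a frequency per number of expressed sequence tags. The sketch below uses only the counts stated in the text; the function name is illustrative, and because each library yielded at most a single clone, the result should be read as descriptive rather than as a statistical estimate.

```python
def est_frequency(hits: int, total_ests: int) -> float:
    """Fraction of expressed sequence tags in a library that match the gene."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    return hits / total_ests


tnf_induced = est_frequency(1, 2414)  # TNF-induced amniotic cell library
untreated = est_frequency(0, 3595)    # non-TNF-treated amniotic cell library

print(f"TNF-induced: {tnf_induced * 1e4:.2f} per 10,000 ESTs")
print(f"untreated:   {untreated * 1e4:.2f} per 10,000 ESTs")
```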

The NAF-1 cDNA contains an open reading frame encoding a polypeptide
of 35.8 kD. Amino acids 1-23 and 1-26 represent putative signal peptides.
Accordingly, there are two species of predicted mature NAF-1 polypeptides, one
having 311 and the other 314 amino acids. NAF-1 also contains a putative
N-linked glycosylation site at position 303. The homology of NAF-1 to FSP covers
amino acids 199-495 of the latter protein. Thus, NAF-1 does not appear to be the
human counterpart of the rat FSP. NAF-1 contains only one TSR, which begins at
amino acid 278. This region is much more homologous to FSP type 1 repeats than
to those of TSP, 38% versus 20%, respectively. The homology between the NAF-1
TSR and the six FSP type 1 repeats is shown in Figure 3. The amino-terminal
277 amino acids of NAF-1 share homology to FSP but show no resemblance to any
other known proteins.

In accordance with another aspect of the present invention there are provided 
isolated polynucleotides encoding a mature polypeptide expressed by the human 
cDNA contained in ATCC Deposit No. 97343, deposited with the American Type 
Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, USA, on
November 20, 1995. The deposited material is a pBluescript SK (-) (Stratagene, La
Jolla, CA) plasmid that contains the full-length NAF-1 cDNA. The NAF-1 cDNA
has been cloned into the EcoRI and XhoI sites.



The deposit has been made under the terms of the Budapest Treaty on the 
International Recognition of the Deposit of Micro-organisms for purposes of Patent 
Procedure. The strain will be irrevocably and without restriction or condition 
released to the public upon the issuance of a patent. The deposit is provided merely
as a convenience to those of skill in the art and is not an admission that a deposit is
required under 35 U.S.C. §112. The sequence of the polynucleotide contained in
the deposited material, as well as the amino acid sequence of the polypeptides
encoded thereby, are controlling in the event of any conflict with any description of
sequences herein. A license may be required to make, use or sell the deposited
material, and no such license is hereby granted. 

The polynucleotide of the present invention may be in the form of RNA or 
in the form of DNA, which DNA includes cDNA, genomic DNA, and synthetic 
DNA. The DNA may be double-stranded or single-stranded, and if single stranded 
may be the coding strand or non-coding (anti-sense) strand. The coding sequence 
which encodes the mature polypeptide may be identical to the coding sequence
shown in Figure 1 (SEQ ID NO:1) or may be a different coding sequence which,
as a result of the redundancy or degeneracy of the genetic code,
encodes the same mature polypeptide as the DNA of Figure 1 (SEQ ID NO:1).
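
The degeneracy referred to above can be illustrated with two different coding sequences that translate to the same peptide. The sketch below uses a deliberately partial excerpt of the standard genetic code, limited to the codons in the example; both sequences are hypothetical and are not portions of SEQ ID NO:1.

```python
# Partial excerpt of the standard genetic code: only the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "CTG": "L", "TTA": "L",
    "AGC": "S", "TCT": "S",
    "TAA": "*", "TGA": "*",  # stop codons
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


seq_a = "ATGGCTCTGAGCTAA"  # hypothetical coding sequence
seq_b = "ATGGCCTTATCTTGA"  # different codons, same encoded peptide

print(translate(seq_a), translate(seq_b), seq_a == seq_b)
```

A complete implementation would use the full 64-codon table; the excerpt here raises a KeyError for any codon it does not list.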

The polynucleotide which encodes for the mature polypeptide of Figure 1 
(SEQ ID NO:2) may include, but is not limited to: only the coding sequence for the 
mature polypeptide; the coding sequence for the mature polypeptide and additional 
coding sequence such as a leader or secretory sequence or a proprotein sequence; 
the coding sequence for the mature polypeptide (and optionally additional coding 
sequence) and non-coding sequence, such as introns or non-coding sequence 5'
and/or 3' of the coding sequence for the mature polypeptide.

Thus, the term "polynucleotide encoding a polypeptide" encompasses a 
polynucleotide which includes only coding sequence for the polypeptide as well as a 
polynucleotide which includes additional coding and/or non-coding sequence. 

The present invention further relates to variants of the hereinabove described 
polynucleotides which encode for fragments, analogs and derivatives of the 
polypeptide having the deduced amino acid sequence of Figure 1 (SEQ ID NO:2). 
The variant of the polynucleotide may be a naturally occurring allelic variant of the 
polynucleotide or a non-naturally occurring variant of the polynucleotide. 

Thus, the present invention includes polynucleotides encoding the same 
mature polypeptide as shown in Figure 1 (SEQ ID NO:2) as well as variants of such
polynucleotides which variants encode for a fragment, derivative or analog of the
polypeptide of Figure 1 (SEQ ID NO:2). Such nucleotide variants include deletion
variants, substitution variants and addition or insertion variants. 

As hereinabove indicated, the polynucleotide may have a coding sequence
which is a naturally occurring allelic variant of the coding sequence shown in Figure
1 (SEQ ID NO:1). As known in the art, an allelic variant is an alternate form of a
polynucleotide sequence which may have a substitution, deletion or addition of one
or more nucleotides, which does not substantially alter the function of the encoded
polypeptide.

The present invention also includes polynucleotides, wherein the coding 
sequence for the mature polypeptide may be fused in the same reading frame to a 
polynucleotide sequence which aids in expression and secretion of a polypeptide 
from a host cell, for example, a leader sequence which functions as a secretory
sequence for controlling transport of a polypeptide from the cell. The polypeptide 
having a leader sequence is a preprotein and may have the leader sequence cleaved 
by the host cell to form the mature form of the polypeptide. The polynucleotides 
may also encode for a proprotein which is the mature protein plus additional 5' 
amino acid residues. A mature protein having a prosequence is a proprotein and is 
an inactive form of the protein. Once the prosequence is cleaved an active mature 
protein remains. Thus, for example, the polynucleotide of the present 
invention may encode for a mature protein, or for a protein having a prosequence or 
for a protein having both a prosequence and a presequence (leader sequence). 

The polynucleotides of the present invention may also have the coding
sequence fused in frame to a marker sequence which allows for purification of the
polypeptide of the present invention. The marker sequence may be a hexa-histidine
tag supplied by a pQE-9 vector to provide for purification of the mature polypeptide
fused to the marker in the case of a bacterial host, or, for example, the marker
sequence may be a hemagglutinin (HA) tag when a mammalian host, e.g. COS-7
cells, is used. The HA tag corresponds to an epitope derived from the influenza
hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).

Most highly preferred are nucleic acid molecules encoding the mature
protein having the amino acid sequence shown in SEQ ID NO:2 as residues 24-331
or 27-331, or the mature NAF-1 amino acid sequence encoded by the deposited
cDNA clone.

Also highly preferred are nucleic acid molecules encoding the TSR domain
of the protein having the amino acid sequence shown in SEQ ID NO:8 or the TSR
domain of the NAF-1 amino acid sequence encoded by the deposited cDNA clone.

Thus, one aspect of the invention provides an isolated nucleic acid
molecule comprising a polynucleotide having a nucleotide sequence selected from
the group consisting of: (a) a nucleotide sequence encoding a full-length NAF-1
polypeptide having the complete amino acid sequence in SEQ ID NO:2, or the
complete amino acid sequence encoded by the cDNA clone contained in the
ATCC Deposit No. 97343; (b) a nucleotide sequence encoding a full-length NAF-1
polypeptide having the complete amino acid sequence in SEQ ID NO:2
excepting the N-terminal methionine (i.e., positions 2 to 331 of SEQ ID NO:2) or
the complete amino acid sequence excepting the N-terminal methionine encoded by
the cDNA clone contained in the ATCC Deposit No. 97343; (c) a nucleotide
sequence encoding a predicted mature form of the NAF-1 polypeptide having the
amino acid sequence at positions 24-331 or 27-331 in SEQ ID NO:2 or as encoded
by the cDNA clone contained in the ATCC Deposit No. 97343; (d) a nucleotide
sequence encoding a polypeptide comprising the predicted TSR domain of the
NAF-1 polypeptide having the amino acid sequence at positions 284-330 in SEQ
ID NO:2 or as encoded by the cDNA clone contained in the ATCC Deposit No.
97343; and (e) a nucleotide sequence complementary to any of the nucleotide
sequences in (a), (b), (c) or (d) above.

Further embodiments of the invention include isolated nucleic acid
molecules that comprise a polynucleotide having a nucleotide sequence at least
90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99%
identical, to any of the nucleotide sequences in (a), (b), (c), (d) or (e), above, or a
polynucleotide which hybridizes under stringent hybridization conditions to a
polynucleotide in (a), (b), (c), (d) or (e), above. This polynucleotide which
hybridizes does not hybridize under stringent hybridization conditions to a
polynucleotide having a nucleotide sequence consisting of only A residues or of
only T residues. An additional nucleic acid embodiment of the invention relates to
an isolated nucleic acid molecule comprising a polynucleotide which encodes the
amino acid sequence of an epitope-bearing portion of a NAF-1 polypeptide having
an amino acid sequence in (a), (b), (c), (d) or (e), above.

The present invention also relates to recombinant vectors, which include
the isolated nucleic acid molecules of the present invention, and to host cells
containing the recombinant vectors, as well as to methods of making such vectors
and host cells and for using them for production of NAF-1 polypeptides or
peptides by recombinant techniques.






By a polynucleotide having a nucleotide sequence at least, for example, 
95% "identical" to a reference nucleotide sequence encoding a NAF-1 polypeptide 
is intended that the nucleotide sequence of the polynucleotide is identical to the 
reference sequence except that the polynucleotide sequence may include up to five 
point mutations per each 100 nucleotides of the reference nucleotide sequence 
encoding the NAF-1 polypeptide. In other words, to obtain a polynucleotide 
having a nucleotide sequence at least 95% identical to a reference nucleotide 
sequence, up to 5% of the nucleotides in the reference sequence may be deleted or 
substituted with another nucleotide, or a number of nucleotides up to 5% of the 
total nucleotides in the reference sequence may be inserted into the reference 
sequence. These mutations of the reference sequence may occur at the 5' or 3' 
terminal positions of the reference nucleotide sequence or anywhere between those 
terminal positions, interspersed either individually among nucleotides in the 
reference sequence or in one or more contiguous groups within the reference 
sequence.
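
The arithmetic behind this definition can be sketched as follows. The first function gives the largest number of point differences that still satisfies a stated identity threshold for a reference of a given length. The second computes percent identity for the simplest case of an equal-length candidate that differs by substitutions only; insertions and deletions, which the definition also permits, would require an alignment step that is not shown. All names and example sequences are illustrative.

```python
def max_differences(reference_length: int, percent_identity: int) -> int:
    """Largest number of point differences that still meets the identity threshold."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return reference_length * (100 - percent_identity) // 100


def percent_identity_ungapped(reference: str, candidate: str) -> float:
    """Percent of reference positions matched by an equal-length candidate."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == c for r, c in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


reference = "ACGTACGTACGTACGTACGT"  # hypothetical 20-nt reference
candidate = "ACGTACGTACTTACGTACGT"  # one substitution at position 11

print(max_differences(100, 95))                         # five per 100 nucleotides
print(percent_identity_ungapped(reference, candidate))  # 19 of 20 positions match
```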

As a practical matter, whether any particular nucleic acid molecule is at
least 90%, 95%, 96%, 97%, 98% or 99% identical to, for instance, the nucleotide
sequence shown in Figure 1 or to the nucleotide sequence of the deposited cDNA
clone can be determined conventionally using known computer programs such as
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix,
Genetics Computer Group, University Research Park, 575 Science Drive,
Madison, WI 53711). Bestfit uses the local homology algorithm of Smith and
Waterman, Advances in Applied Mathematics 2:482-489 (1981), to find the best
segment of homology between two sequences. When using Bestfit or any other
sequence alignment program to determine whether a particular sequence is, for
instance, 95% identical to a reference sequence according to the present invention,
the parameters are set, of course, such that the percentage of identity is calculated
over the full length of the reference nucleotide sequence and that gaps in homology
of up to 5% of the total number of nucleotides in the reference sequence are
allowed.
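
For orientation, the local homology algorithm of Smith and Waterman can be sketched in a few lines. The following is a minimal illustration and not the Bestfit implementation: the scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for the example, and the function name and test sequences are hypothetical.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> tuple[int, str, str]:
    """Return (best score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, seg_a, seg_b = smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG")
print(score, seg_a, seg_b)
```

Because the score is reset to zero whenever it would become negative, the algorithm reports the best-matching segment rather than forcing an end-to-end alignment, which is why the identity calculation described above must separately be referred to the full length of the reference sequence.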

The present application is directed to nucleic acid molecules at least 90%, 
95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in 
Figure 1 (SEQ ID NO: 1) or to the nucleic acid sequence of the deposited cDNA, 
irrespective of whether they encode a polypeptide having NAF-1 activity. This is
because even where a particular nucleic acid molecule does not encode a
polypeptide having NAF-1 activity, one of skill in the art would still know how to
use the nucleic acid molecule, for instance, as a hybridization probe or a
polymerase chain reaction (PCR) primer. Uses of the nucleic acid molecules of the
present invention that do not encode a polypeptide having NAF-1 activity
include, inter alia, (1) isolating the NAF-1 gene or allelic variants thereof in a
cDNA library; (2) in situ hybridization (e.g., "FISH") to metaphase chromosomal
spreads to provide precise chromosomal location of the NAF-1 gene, as described
in Verma et al., Human Chromosomes: A Manual of Basic Techniques, Pergamon
Press, New York (1988); and (3) Northern blot analysis for detecting NAF-1 mRNA
expression in specific tissues.

Preferred, however, are nucleic acid molecules having sequences at least 
90%, 95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in 
Figure 1 (SEQ ID NO:1) or to the nucleic acid sequence of the deposited cDNA
which do, in fact, encode a polypeptide having NAF-1 protein activity. By "a 
polypeptide having NAF-1 activity" is intended polypeptides exhibiting activity 
similar, but not necessarily identical, to an activity of the mature protein of the 
invention, as measured in a particular biological assay. For example, the NAF-1 
protein of the present invention causes axonal neurite extension and promotes 
neural cell adhesion. Such activity can be assayed as described in Klar, et al., Cell 
69:95-110, incorporated herein by reference.

NAF-1 protein modulates axonal neurite extension and neural cell adhesion 
in a dose-dependent manner in the above-described assay. Thus, "a polypeptide 
having NAF-1 protein activity" includes polypeptides that also exhibit any of the 
same neurite extension and neural cell adhesion promoting activities in the above- 
described assays in a dose-dependent manner. Although the degree of dose- 
dependent activity need not be identical to that of the NAF-1 protein, preferably, 
"a polypeptide having NAF-1 protein activity" will exhibit substantially similar 
dose-dependence in a given activity as compared to the NAF-1 protein (i.e., the 
candidate polypeptide will exhibit greater activity or not more than about 25-fold 
less and, preferably, not more than about tenfold less activity relative to the 
reference NAF-1 protein). 
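
The activity comparison in the preceding paragraph can be expressed as a simple rule. The sketch below treats the "about 25-fold" and "about tenfold" limits as hard cutoffs, which is a simplifying assumption; the function name and category labels are illustrative, and the activity values may be in any unit provided both measurements come from the same assay.

```python
def classify_activity(candidate: float, reference: float) -> str:
    """Classify candidate activity relative to the reference NAF-1 protein."""
    if candidate <= 0 or reference <= 0:
        raise ValueError("activities must be positive")
    fold_less = reference / candidate  # values below 1 mean greater activity
    if fold_less <= 10:
        return "preferred"       # greater activity, or not more than tenfold less
    if fold_less <= 25:
        return "acceptable"      # not more than 25-fold less
    return "outside range"


for activity in (120.0, 5.0, 2.0):  # hypothetical readings; reference is 100.0
    print(activity, classify_activity(activity, 100.0))
```
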

Of course, due to the degeneracy of the genetic code, one of ordinary skill
in the art will immediately recognize that a large number of the nucleic acid
molecules having a sequence at least 90%, 95%, 96%, 97%, 98%, or 99% identical
to the nucleic acid sequence of the deposited cDNA or the nucleic acid sequence
shown in Figure 1 (SEQ ID NO:1) will encode a polypeptide "having NAF-1
protein activity." In fact, since degenerate variants of these nucleotide sequences
all encode the same polypeptide, this will be clear to the skilled artisan even
without performing the above described comparison assay. It will be further
recognized in the art that, for such nucleic acid molecules that are not degenerate
variants, a reasonable number will also encode a polypeptide having NAF-1
protein activity. This is because the skilled artisan is fully aware of amino acid
substitutions that are either less likely or not likely to significantly affect protein
function (e.g., replacing one aliphatic amino acid with a second aliphatic amino
acid), as further described below.

The term "gene" means the segment of DNA involved in producing a
polypeptide chain; it includes regions preceding and following the coding region
(leader and trailer) as well as intervening sequences (introns) between individual
coding segments (exons).

The present invention is further directed to nucleic acid molecules encoding
portions of the nucleotide sequences described herein as well as to fragments of
the isolated nucleic acid molecules described herein. In particular, the invention
provides a polynucleotide having a nucleotide sequence representing the portion of
SEQ ID NO:1 which consists of positions 1-1010 of SEQ ID NO:1.

In addition, the invention provides nucleic acid molecules having nucleotide
sequences related to extensive portions of SEQ ID NO:1 which have been
determined from the following related cDNA clones: HLHCE24R (shown as SEQ
ID NO:16); HLHDR83R (shown as SEQ ID NO:17) and HPTSB36R (shown as
SEQ ID NO:18).

Further, the invention includes a polynucleotide comprising any portion of
at least about 30 nucleotides, preferably at least about 50 nucleotides, of SEQ ID
NO:1 from residue 1-650.

More generally, by a fragment of an isolated nucleic acid molecule having the
nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in
Figure 1 (SEQ ID NO:1) is intended fragments at least about 15 nt, and more
preferably at least about 20 nt, still more preferably at least about 30 nt, and even
more preferably, at least about 40 nt in length which are useful as diagnostic
probes and primers as discussed herein. Of course, larger fragments 50-300 nt in
length are also useful according to the present invention as are fragments
corresponding to most, if not all, of the nucleotide sequence of the deposited
cDNA or as shown in Figure 1 (SEQ ID NO:1). By a fragment at least 20 nt in
length, for example, is intended fragments which include 20 or more contiguous
bases from the nucleotide sequence of the deposited cDNA or the nucleotide
sequence as shown in Figure 1 (SEQ ID NO:1). Preferred nucleic acid fragments
of the present invention include nucleic acid molecules encoding epitope-bearing
portions of the NAF-1 polypeptide as identified in Figure 4 and described in more
detail below.
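
The notion of a fragment that includes 20 or more contiguous bases of a reference sequence can be checked mechanically. The following sketch enumerates every contiguous window of a chosen length and tests whether a candidate molecule contains any of them; the function names and sequences are hypothetical, and only the given strand is examined, so a full check would also test the reverse complement.

```python
from typing import Iterator


def fragments(sequence: str, length: int = 20) -> Iterator[str]:
    """Yield every contiguous fragment of the given length, left to right."""
    for start in range(len(sequence) - length + 1):
        yield sequence[start:start + length]


def shares_fragment(candidate: str, reference: str, min_length: int = 20) -> bool:
    """True if candidate contains at least min_length contiguous reference bases."""
    return any(frag in candidate for frag in fragments(reference, min_length))


reference = "ACGT" * 10                      # hypothetical 40-nt reference
probe = "TT" + reference[5:27] + "GG"        # carries 22 contiguous reference bases
short = reference[:19]                       # only 19 contiguous bases

print(len(list(fragments(reference, 20))))   # number of 20-nt windows in 40 nt
print(shares_fragment(probe, reference), shares_fragment(short, reference))
```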

Fragments of the full length gene of the present invention may be used as a 
hybridization probe for a cDNA library to isolate the full length cDNA and to isolate 
other cDNAs which have a high sequence similarity to the gene or similar biological 
activity. Probes of this type preferably have at least 30 bases and may contain, for 
example, at least 50 bases. The probe may also be used to identify a cDNA clone 
corresponding to a full length transcript and a genomic clone or clones that contain 
the complete gene including regulatory and promoter regions, exons, and introns. 
An example of a screen comprises isolating the coding region of the gene by using 
the known DNA sequence to synthesize an oligonucleotide probe. Labeled 
oligonucleotides having a sequence complementary to that of the gene of the present 
invention are used to screen a library of human cDNA, genomic DNA or mRNA to 
determine which members of the library the probe hybridizes to. 

The present invention further relates to polynucleotides which hybridize to
the hereinabove-described sequences if there is at least 70%, preferably at least
90% and more preferably at least 95%, 96%, 97%, 98% or 99% identity between
the sequences. The present invention particularly relates to polynucleotides which
hybridize under stringent conditions to the hereinabove-described polynucleotides.
As herein used, the term "stringent hybridization conditions" is intended overnight
incubation at 42° C in a solution comprising: 50% formamide, 5x SSC (1x SSC =
150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x
Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon
sperm DNA, followed by washing the filters in 0.1x SSC at about 65° C. The
polynucleotides which hybridize to the hereinabove described polynucleotides in a
preferred embodiment encode polypeptides which retain substantially the
same biological function or activity as the mature polypeptide encoded by the
cDNA of Figure 1 (SEQ ID NO:1).
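
The salt concentrations implied by these conditions follow from the SSC dilution factor. The sketch below assumes the conventional definition of 1x SSC as 150 mM NaCl and 15 mM trisodium citrate; the function name is illustrative.

```python
# Conventional 1x SSC composition in mM (assumed definition).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict[str, float]:
    """Return solute concentrations in mM for an SSC solution of the given strength."""
    if fold < 0:
        raise ValueError("fold must be non-negative")
    return {solute: fold * mm for solute, mm in SSC_1X_MM.items()}


hybridization = ssc_concentrations(5.0)  # 5x SSC used for overnight incubation
wash = ssc_concentrations(0.1)           # 0.1x SSC used for the filter wash

for label, mix in (("5x SSC", hybridization), ("0.1x SSC", wash)):
    print(label, {solute: round(mm, 2) for solute, mm in mix.items()})
```

The fifty-fold drop in salt between hybridization and wash, together with the higher wash temperature, is what makes the wash the more stringent of the two steps.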

Alternatively, the polynucleotide may have at least 15 bases, preferably at
least 30 bases, and more preferably at least 50 bases which hybridize to a
polynucleotide of the present invention and which has an identity thereto, as
described, and which may or may not retain activity. For example, such
polynucleotides may be employed as probes for the polynucleotide of SEQ ID
NO:1, for example, for recovery of the polynucleotide or as a diagnostic probe or as
a PCR primer.

Thus, the present invention is directed to polynucleotides having at least a
70% identity, preferably at least 90% and more preferably at least a 95%, 96%,
97%, 98%, or 99% identity to a polynucleotide which encodes the polypeptide of
SEQ ID NO:2 and polynucleotides complementary thereto as well as portions
thereof, which portions have at least 30 consecutive bases and preferably at least 50
consecutive bases and to polypeptides encoded by such polynucleotides.

The present invention further relates to a polypeptide which has the deduced 
amino acid sequence of Figure 1 (SEQ ID NO:2), as well as fragments, analogs and
derivatives of such polypeptide.

To improve or alter the characteristics of NAF-1 polypeptides, protein
engineering may be employed. Recombinant DNA technology known to those
skilled in the art can be used to create novel mutant proteins or "muteins" including
single or multiple amino acid substitutions, deletions, additions or fusion proteins.
Such modified polypeptides can show, e.g., enhanced activity or increased
stability. In addition, they may be purified in higher yields and show better
solubility than the corresponding natural polypeptide, at least under certain
purification and storage conditions.

For instance, for many proteins, including the extracellular domain of a
membrane associated protein or the mature form(s) of a secreted protein, it is
known in the art that one or more amino acids may be deleted from the N-terminus
or C-terminus without substantial loss of biological function. For instance, Ron et
al., J. Biol. Chem., 268:2984-2988 (1993) reported modified KGF proteins that
had heparin binding activity even if 3, 8, or 27 amino-terminal amino acid residues
were missing. In the present case, since the protein of the invention contains a
TSR repeat, deletions of N-terminal amino acids up to the cysteine at position 284
(C284) of SEQ ID NO:2 may retain some biological activity such as the ability to
promote cell adhesion. Additional deletions including C284, by contrast, would not be
expected to retain such biological activities because it is known that this residue in
the TSR repeat is required for secondary structure necessary to promote cell
adhesion.

However, even if deletion of one or more amino acids from the N-terminus
of a protein results in modification or loss of one or more biological functions of 
the protein, other biological activities may still be retained. Thus, the ability of 
the shortened protein to induce and/or bind to antibodies which recognize the 
complete or TSR domain of the protein generally will be retained when less than 
the majority of the residues of the complete or TSR domain are removed from the 
N-terminus. Whether a particular polypeptide lacking N-terminal residues of a 
complete protein retains such immunologic activities can readily be determined by 
routine methods described herein and otherwise known in the art. 

Accordingly, the present invention further provides polypeptides having 
one or more residues deleted from the amino terminus of the amino acid sequence 
of the NAF-1 shown in SEQ ID NO:2, up to the C284 residue, and 
polynucleotides encoding such polypeptides. In particular, the present invention 
provides polypeptides comprising the amino. 

Similarly, many examples of biologically functional C-terminal deletion 
muteins are known. For instance, interferon gamma shows up to ten times higher
activity when 8-10 amino acid residues are deleted from the carboxy terminus of the
protein (Dobeli et al., J. Biotechnology 7:199-216 (1988)). However, even if
deletion of one or more amino acids from the C-terminus of a protein results in
modification or loss of one or more biological functions of the protein, other
biological activities may still be retained. Thus, the ability of the shortened 
protein to induce and/or bind to antibodies which recognize the complete or TSR 
domain of the protein generally will be retained when less than the majority of the 
residues of the complete or TSR domain protein are removed from the C-terminus. 
Whether a particular polypeptide lacking C-terminal residues of a complete 
protein retains such immunologic activities can readily be determined by routine 
methods described herein and otherwise known in the art. 

Accordingly, the present invention further provides polypeptides having
one or more residues deleted from the carboxy terminus of the amino acid sequence
of the NAF-1 shown in SEQ ID NO:2, up to the C330 residue of SEQ ID NO:2, and
polynucleotides encoding such polypeptides. In particular, the present invention
provides polypeptides having the amino acid sequence of residues 1-m of the
amino acid sequence in SEQ ID NO:2, where m is either 330 or 331, and C330 is
the position of the first residue from the C-terminus of the complete NAF-1
polypeptide (shown in SEQ ID NO:2) believed to be required for cell adhesion of
the NAF-1 protein.

The invention also provides polypeptides having one or more amino acids 
deleted from both the amino and the carboxyl termini, which may be described 
generally as having residues n-m of SEQ ID NO:2, where n and m are integers as 
described above. 
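
The n-m notation can be made concrete with a short sketch. The following Python is illustrative only: the placeholder sequence stands in for SEQ ID NO:2 (which is not reproduced here), the function names are hypothetical, and the range n = 1 to 284 is an assumption inferred from the statement that N-terminal deletions extend up to the C284 residue.

```python
def deletion_mutein(sequence: str, n: int, m: int) -> str:
    """Return residues n..m (1-based, inclusive) of a polypeptide sequence."""
    if not 1 <= n <= m <= len(sequence):
        raise ValueError("require 1 <= n <= m <= len(sequence)")
    return sequence[n - 1:m]


# Hypothetical 331-residue placeholder; NOT the actual SEQ ID NO:2.
PLACEHOLDER = "M" + "A" * 330

# Assumed ranges: n = 1..284 (first retained residue), m = 330 or 331 (last retained residue).
N_RANGE = range(1, 285)
M_VALUES = (330, 331)


def enumerate_muteins(sequence: str):
    """Yield (n, m, fragment) for every combination of the assumed n and m values."""
    for n in N_RANGE:
        for m in M_VALUES:
            yield n, m, deletion_mutein(sequence, n, m)


muteins = list(enumerate_muteins(PLACEHOLDER))
shortest = deletion_mutein(PLACEHOLDER, 284, 330)
print(len(muteins), len(shortest))
```

With these assumed ranges the shortest fragment, residues 284-330, corresponds to the span given elsewhere in the text for the TSR domain.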

Also included is a nucleotide sequence encoding a polypeptide consisting
of a portion of the complete NAF-1 amino acid sequence encoded by the cDNA
clone contained in ATCC Deposit No. 97343, where this portion excludes from 1 
to about 283 amino acids from the amino terminus of the complete amino acid 
sequence encoded by the cDNA clone contained in ATCC Deposit No. 97343, or 
1 amino acid from the carboxy terminus, or any combination of the above amino 
terminal and carboxy terminal deletions, of the complete amino acid sequence 
encoded by the cDNA clone contained in ATCC Deposit No. 97343. 
Polynucleotides encoding all of the above deletion mutant polypeptide forms also 
are provided. 

In addition to terminal deletion forms of the protein discussed above, it 
also will be recognized by one of ordinary skill in the art that some amino acid 
sequences of the NAF-1 polypeptide can be varied without significant effect on
the structure or function of the protein. If such differences in sequence are
contemplated, it should be remembered that there will be critical areas on the 
protein which determine activity. 

Thus, the invention further includes variations of the NAF-1 polypeptide 
which show substantial NAF-1 polypeptide activity or which include regions of 
NAF-1 protein such as the protein portions discussed below. Such mutants 
include deletions, insertions, inversions, repeats, and type substitutions selected
according to general rules known in the art so as to have little effect on activity. For
example, guidance concerning how to make phenotypically silent amino acid
substitutions is provided in Bowie, J. U. et al., "Deciphering the Message in
Protein Sequences: Tolerance to Amino Acid Substitutions," Science
247:1306-1310 (1990), wherein the authors indicate that there are two main
approaches for studying the tolerance of an amino acid sequence to change. The
first method relies on the process of evolution, in which mutations are either
accepted or rejected by natural selection. The second approach uses genetic 
engineering to introduce amino acid changes at specific positions of a cloned gene 
and selections or screens to identify sequences that maintain functionality. 

As the authors state, these studies have revealed that proteins are 
surprisingly tolerant of amino acid substitutions. The authors further indicate 
which amino acid changes are likely to be permissive at a certain position of the 
protein. For example, most buried amino acid residues require nonpolar side 
chains, whereas few features of surface side chains are generally conserved. Other 
such phenotypically silent substitutions are described in Bowie, J. U. et al.,
supra, and the references cited therein. Typically seen as conservative
substitutions are the replacements, one for another, among the aliphatic amino
acids Ala, Val, Leu and Ile; interchange of the hydroxyl residues Ser and Thr;
exchange of the acidic residues Asp and Glu; substitution between the amide
residues Asn and Gln; exchange of the basic residues Lys and Arg; and
replacements among the aromatic residues Phe and Tyr.
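
The substitution classes listed above lend themselves to a simple lookup. The following Python is a minimal sketch of such a check; the group labels and the function name `is_conservative` are hypothetical, and the grouping reflects only the six classes named in this paragraph.

```python
# Conservative exchange groups as listed in the paragraph above (three-letter codes).
CONSERVATIVE_GROUPS = {
    "aliphatic": {"Ala", "Val", "Leu", "Ile"},
    "hydroxyl": {"Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "amide": {"Asn", "Gln"},
    "basic": {"Lys", "Arg"},
    "aromatic": {"Phe", "Tyr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if replacing `original` with `replacement` stays within one group.

    Identical residues are not a substitution, so they return False.
    """
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )


print(is_conservative("Leu", "Ile"))  # same aliphatic group
print(is_conservative("Asp", "Lys"))  # acidic versus basic
```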

Thus, the fragment, derivative or analog of the polypeptide of SEQ ID
NO:2, or that encoded by the deposited cDNA, may be (i) one in which one or
more of the amino acid residues are substituted with a conserved or non-conserved
amino acid residue (preferably a conserved amino acid residue) and such
substituted amino acid residue may or may not be one encoded by the genetic
code, or (ii) one in which one or more of the amino acid residues includes a
substituent group, or (iii) one in which the mature polypeptide is fused with
another compound, such as a compound to increase the half-life of the
polypeptide (for example, polyethylene glycol), or (iv) one in which the additional
amino acids are fused to the above form of the polypeptide, such as an IgG Fc
fusion region peptide or leader or secretory sequence or a sequence which is
employed for purification of the above form of the polypeptide or a proprotein
sequence. Such fragments, derivatives and analogs are deemed to be within the
scope of those skilled in the art from the teachings herein.

Thus, the NAF-1 of the present invention may include one or more amino 
acid substitutions, deletions or additions, either from natural mutations or human 
manipulation. As indicated, changes are preferably of a minor nature, such as
conservative amino acid substitutions that do not significantly affect the folding or
activity of the protein (see Table 1).

TABLE 1. Conservative Amino Acid Substitutions.

Aromatic        Phenylalanine
                Tryptophan
                Tyrosine

Hydrophobic     Leucine
                Isoleucine
                Valine

Polar           Glutamine
                Asparagine

Basic           Arginine
                Lysine
                Histidine

Acidic          Aspartic Acid
                Glutamic Acid

Small           Alanine
                Serine
                Threonine
                Methionine
                Glycine



Amino acids in the NAF-1 protein of the present invention that are 
essential for function can be identified by methods known in the art, such as site- 
directed mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, 
Science 244:1081-1085 (1989)). The latter procedure introduces single alanine 
mutations at every residue in the molecule. The resulting mutant molecules are 
then tested for biological activity such as receptor binding or in vitro or in vivo
proliferative activity.
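
As an illustration of how an alanine scan enumerates its mutants, the following Python is a minimal sketch. The function name `alanine_scan` and the four-residue example peptide are hypothetical; positions that are already alanine are skipped here, which is a simplifying assumption because substituting alanine for alanine yields no new molecule.

```python
def alanine_scan(sequence: str):
    """Yield (position, original_residue, mutant_sequence) for each single-alanine mutant.

    Positions are 1-based. Residues that are already alanine ("A") are skipped.
    """
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield index + 1, residue, mutant


# Hypothetical four-residue peptide in one-letter code, used only for illustration.
for position, residue, mutant in alanine_scan("MKCA"):
    print(f"{residue}{position}A", mutant)
```

Each mutant produced this way would then be assayed individually, so the number of constructs grows linearly with the length of the region scanned.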

Of special interest are substitutions of charged amino acids with other 
charged or neutral amino acids which may produce proteins with highly desirable 
improved characteristics, such as less aggregation. Aggregation may not only 
reduce activity but also be problematic when preparing pharmaceutical 
formulations, because aggregates can be immunogenic (Pinckard et al., Clin. Exp.
Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36:838-845 (1987); Cleland
et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307).



Figure 3 shows the consensus TSR sequence (SEQ ID NO:14).
Preferred mutants having increased cell adhesion activity are those with 
substitutions making the NAF-1 polypeptides more similar to the consensus 
sequence. 

The terms "fragment," "derivative" and "analog" when referring to the
polypeptide of Figure 1 (SEQ ID NO:2), mean a polypeptide which retains
essentially the same biological function or activity as such polypeptide. Thus, an 
analog includes a proprotein which can be activated by cleavage of the proprotein 
portion to produce an active mature polypeptide. 

The polypeptide of the present invention may be a recombinant polypeptide, 
a natural polypeptide or a synthetic polypeptide, preferably a recombinant 
polypeptide. 

The fragment, derivative or analog of the polypeptide of Figure 1 (SEQ ID 
NO:2) may be (i) one in which one or more of the amino acid residues are 
substituted with a conserved or non-conserved amino acid residue (preferably a 
conserved amino acid residue) and such substituted amino acid residue may or may 
not be one encoded by the genetic code, or (ii) one in which one or more of the 
amino acid residues includes a substituent group, or (iii) one in which the mature 
polypeptide is fused with another compound, such as a compound to increase the 
half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which 
the additional amino acids are fused to the mature polypeptide, such as a leader or 
secretory sequence or a sequence which is employed for purification of the mature 
polypeptide or a proprotein sequence. Such fragments, derivatives and analogs are 
deemed to be within the scope of those skilled in the art from the teachings herein. 

The polypeptides and polynucleotides of the present invention are preferably 
provided in an isolated form, and preferably are purified to homogeneity. 

The term "isolated" means that the material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For 
example, a naturally-occurring polynucleotide or polypeptide present in a living 
animal is not isolated, but the same polynucleotide or polypeptide, separated from 
some or all of the coexisting materials in the natural system, is isolated. Such 
polynucleotides could be part of a vector and/or such polynucleotides or 
polypeptides could be part of a composition, and still be isolated in that such vector 
or composition is not part of its natural environment.

The invention further provides an isolated NAF-1 polypeptide comprising
an amino acid sequence selected from the group consisting of: (a) the amino acid 
sequence of the full-length NAF-1 polypeptide having the complete amino acid 
sequence shown in SEQ ID NO:2 or the complete amino acid sequence excepting 
the N-terminal methionine encoded by the cDNA clone contained in the ATCC 
Deposit No. 97343; (b) the amino acid sequence of the full-length NAF-1 
polypeptide having the complete amino acid sequence shown in SEQ ID NO:2 
excepting the N-terminal methionine (i.e., positions 1-331 of SEQ ID NO:2) or the 
complete amino acid sequence excepting the N-terminal methionine encoded by the 
cDNA clone contained in the ATCC Deposit No. 97343; (c) the amino acid
sequence of the mature NAF-1 polypeptide having the amino acid sequence of
residues 24-331 or 27-331 in SEQ ID NO:2, or the mature NAF-1 amino acid
sequence as encoded by the cDNA clone contained in ATCC Deposit No. 97343; 
and (d) the amino acid sequence of the TSR domain of NAF-1 having the amino 
acid sequence of residues 284 to 330 of SEQ ID NO:2, or the amino acid sequence 
of the TSR domain of NAF-1 encoded by the cDNA clone contained in ATCC 
Deposit No. 97343. 

Further polypeptides of the present invention include polypeptides which 
have at least 90% similarity, more preferably at least 95% similarity, and still 
more preferably at least 96%, 97%, 98% or 99% similarity to those described 
above. The polypeptides of the invention also comprise those which are at least 
80% identical, more preferably at least 90% or 95% identical, still more preferably 
at least 96%, 97%, 98% or 99% identical to the polypeptide encoded by the 
deposited cDNA or to the polypeptide of SEQ ID NO:2, and also include 
portions of such polypeptides with at least 30 amino acids and more preferably at
least 50 amino acids.

By "% similarity" for two polypeptides is intended a similarity score 
produced by comparing the amino acid sequences of the two polypeptides using 
the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix, 
Genetics Computer Group, University Research Park, 575 Science Drive, 
Madison, WI 53711) and the default settings for determining similarity. Bestfit
uses the local homology algorithm of Smith and Waterman (Advances in Applied
Mathematics 2:482-489, 1981) to find the best segment of similarity between two
sequences.
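
The local homology algorithm referred to above can be sketched in a few lines. The following Python computes only the best local alignment score and is not the Bestfit program: the scoring values (match +2, mismatch -1, linear gap penalty -1) and the function name `smith_waterman_score` are assumptions chosen for illustration, whereas Bestfit uses its own scoring matrices and gap penalties.

```python
def smith_waterman_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a simple match/mismatch score and a
    linear gap penalty (illustrative values, not Bestfit defaults).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            # Flooring at zero is what makes the alignment local rather than global.
            score[i][j] = max(0, diagonal, up, left)
            best = max(best, score[i][j])
    return best


print(smith_waterman_score("ACGT", "ACT"))
```

Recovering the aligned segment itself would additionally require a traceback from the highest-scoring cell, which is omitted here for brevity.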

By a polypeptide having an amino acid sequence at least, for example, 95%
"identical" to a reference amino acid sequence of a NAF-1 polypeptide is intended
that the amino acid sequence of the polypeptide is identical to the reference
sequence except that the polypeptide sequence may include up to five amino acid
alterations per each 100 amino acids of the reference amino acid sequence of the
NAF-1 polypeptide. In other words, to obtain a polypeptide having an amino acid
sequence at least 95% identical to a reference amino acid sequence, up to 5% of the
amino acid residues in the reference sequence may be deleted or substituted with
another amino acid, or a number of amino acids up to 5% of the total amino acid
residues in the reference sequence may be inserted into the reference sequence.
These alterations of the reference sequence may occur at the amino or carboxy 
terminal positions of the reference amino acid sequence or anywhere between 
those terminal positions, interspersed either individually among residues in the 
reference sequence or in one or more contiguous groups within the reference 
sequence. 
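
The arithmetic behind this definition can be illustrated as follows. The Python below is a sketch under stated assumptions: the function names are hypothetical, the number of permitted alterations is rounded down to a whole residue, and the two sequences are supplied as an already-computed gapped alignment of equal length with "-" marking a gap.

```python
def max_alterations(reference_length: int, percent_identity: int) -> int:
    """Largest whole number of alterations allowed at the given percent identity."""
    return reference_length * (100 - percent_identity) // 100


def count_alterations(aligned_ref: str, aligned_query: str) -> int:
    """Count substitutions, deletions and insertions between two aligned sequences."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for r, q in zip(aligned_ref, aligned_query) if r != q)


def is_at_least_identical(aligned_ref: str, aligned_query: str,
                          percent_identity: int, gap: str = "-") -> bool:
    """Check the query against the threshold, measured over the full reference length."""
    reference_length = sum(1 for r in aligned_ref if r != gap)
    allowed = max_alterations(reference_length, percent_identity)
    return count_alterations(aligned_ref, aligned_query) <= allowed


# A 331-residue reference at the 95% level tolerates 331 * 5 // 100 alterations.
print(max_alterations(331, 95))
```

For example, a column in which the reference shows a gap and the query shows a residue is counted as one inserted amino acid, consistent with insertions being treated as alterations in the definition above.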

As a practical matter, whether any particular polypeptide is at least 90%,
95%, 96%, 97%, 98% or 99% identical to, for instance, the amino acid sequence
shown in SEQ ID NO:2 or to the amino acid sequence encoded by the deposited
cDNA clone can be determined conventionally using known computer programs
such as the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for
Unix, Genetics Computer Group, University Research Park, 575 Science Drive,
Madison, WI 53711). When using Bestfit or any other sequence alignment
program to determine whether a particular sequence is, for instance, 95% identical
to a reference sequence according to the present invention, the parameters are set,
of course, such that the percentage of identity is calculated over the full length of
the reference amino acid sequence and that gaps in homology of up to 5% of the
total number of amino acid residues in the reference sequence are allowed.

In another aspect, the invention provides a peptide or polypeptide 
comprising an epitope-bearing portion of a polypeptide of the invention. The 
epitope of this polypeptide portion is an immunogenic or antigenic epitope of a 
polypeptide of the invention. An "immunogenic epitope" is defined as a part of a 
protein that elicits an antibody response when the whole protein is the 
immunogen. On the other hand, a region of a protein molecule to which an
antibody can bind is defined as an "antigenic epitope." The number of
immunogenic epitopes of a protein generally is less than the number of antigenic
epitopes. See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-
4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic 
epitope (i.e., that contain a region of a protein molecule to which an antibody can 
bind), it is well known in the art that relatively short synthetic peptides that
mimic part of a protein sequence are routinely capable of eliciting an antiserum
that reacts with the partially mimicked protein. See, for instance, Sutcliffe, J. G.,
Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins," Science, 219:660-666. Peptides capable of
eliciting protein-reactive sera are frequently represented in the primary sequence
of a protein, can be characterized by a set of simple chemical rules, and are
confined neither to immunodominant regions of intact proteins (i.e., immunogenic
epitopes) nor to the amino or carboxyl terminals. Antigenic epitope-bearing
peptides and polypeptides of the invention are therefore useful to raise antibodies,
including monoclonal antibodies, that bind specifically to a polypeptide of the
invention. See, for instance, Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention 
preferably contain a sequence of at least seven, more preferably at least nine and 
most preferably between about 15 to about 30 amino acids contained within the 
amino acid sequence of a polypeptide of the invention. Non-limiting examples of 
antigenic polypeptides or peptides that can be used to generate NAF-1-specific
antibodies include: a polypeptide comprising amino acid residues from about [...]
to about [...]; a polypeptide comprising amino acid residues from about [...] to
about [...]; a polypeptide comprising amino acid residues from about [...] to
about [...]; a polypeptide comprising amino acid residues from about [...] to
about [...]; and a polypeptide comprising amino acid residues from about [...] to
about [...]. These polypeptide fragments have
been determined to bear antigenic epitopes of the NAF-1 protein by the analysis 
of the Jameson-Wolf antigenic index, as shown in Figure 4, above. 

The epitope-bearing peptides and polypeptides of the invention may be 
produced by any conventional means. See, e.g., Houghten, R. A. (1985)
"General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids."
Proc. Natl. Acad. Sci. USA 82:5131-5135; this "Simultaneous Multiple Peptide
Synthesis (SMPS)" process is further described in U.S. Patent No. 4,631,211 to
Houghten et al. (1986). 

Epitope-bearing peptides and polypeptides of the invention are used to induce 
antibodies according to methods well known in the art. See, for instance, Sutcliffe 
et al., supra; Wilson et al., supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA
82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).
Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a
protein that elicit an antibody response when the whole protein is the immunogen,
are identified according to methods known in the art. See, for instance, Geysen et
al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a
general method of detecting or determining the sequence of monomers (amino acids 
or other compounds) which is a topological equivalent of the epitope (i.e., a 
"mimotope") which is complementary to a particular paratope (antigen binding 
site) of an antibody of interest. More generally, U.S. Patent No. 4,433,092 to 
Geysen (1989) describes a method of detecting or determining a sequence of 
monomers which is a topographical equivalent of a ligand which is complementary 
to the ligand binding site of a particular receptor of interest. Similarly, U.S. Patent 
No. 5,480,971 to Houghten, R. A. et al. (1996) on Peralkylated Oligopeptide 
Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and
libraries of such peptides, as well as methods for using such oligopeptide sets and
libraries for determining the sequence of a peralkylated oligopeptide that
preferentially binds to an acceptor molecule of interest. Thus, non-peptide
analogs of the epitope-bearing peptides of the invention also can be made
routinely by these methods.

Fragments or portions of the polypeptides of the present invention may be 
employed for producing the corresponding full-length polypeptide by peptide 
synthesis; therefore, the fragments may be employed as intermediates for producing 
the full-length polypeptides. Fragments or portions of the polynucleotides of the 
present invention may be used to synthesize full-length polynucleotides of the 
present invention. 

The present invention also relates to vectors which include polynucleotides 
of the present invention, host cells which are genetically engineered with vectors of 
the invention and the production of polypeptides of the invention by recombinant 
techniques.

Host cells are genetically engineered (transduced or transformed or
transfected) with the vectors of this invention which may be, for example, a cloning 
vector or an expression vector. The vector may be, for example, in the form of a 
plasmid, a viral particle, a phage, etc. The engineered host cells can be cultured in 
conventional nutrient media modified as appropriate for activating promoters, 
selecting transformants or amplifying the genes of the present invention. The 
culture conditions, such as temperature, pH and the like, are those previously used 
with the host cell selected for expression, and will be apparent to the ordinarily 
skilled artisan. 

The polynucleotides of the present invention may be employed for
producing polypeptides by recombinant techniques. Thus, for example, the
polynucleotide may be included in any one of a variety of expression vectors for
expressing a polypeptide. Such vectors include chromosomal, nonchromosomal
and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage
DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids
and phage DNA; and viral DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long as it is replicable
and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a variety
of procedures. In general, the DNA sequence is inserted into an appropriate
restriction endonuclease site(s) by procedures known in the art. Such procedures
and others are deemed to be within the scope of those skilled in the art.

The DNA sequence in the expression vector is operatively linked to an
appropriate expression control sequence(s) (promoter) to direct mRNA synthesis.
As representative examples of such promoters, there may be mentioned: LTR or
SV40 promoter, the E. coli lac or trp, the phage lambda PL promoter and other
promoters known to control expression of genes in prokaryotic or eukaryotic cells
or their viruses. The expression vector also contains a ribosome binding site for
translation initiation and a transcription terminator. The vector may also include
appropriate sequences for amplifying expression.

In addition, the expression vectors preferably contain one or more selectable 
marker genes to provide a phenotypic trait for selection of transformed host cells 
such as dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, 
or such as tetracycline or ampicillin resistance in E. coli.

The vector containing the appropriate DNA sequence as hereinabove
described, as well as an appropriate promoter or control sequence, may be
employed to transform an appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned:
bacterial cells, such as E. coli, Streptomyces, Salmonella typhimurium; fungal cells,
such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells 
such as CHO, COS or Bowes melanoma; adenoviruses; plant cells, etc. The 
selection of an appropriate host is deemed to be within the scope of those skilled in 
the art from the teachings herein. 

More particularly, the present invention also includes recombinant 
constructs comprising one or more of the sequences as broadly described above. 
The constructs comprise a vector, such as a plasmid or viral vector, into which a 
sequence of the invention has been inserted, in a forward or reverse orientation. In a 
preferred aspect of this embodiment, the construct further comprises regulatory 
sequences, including, for example, a promoter, operably linked to the sequence. 
Large numbers of suitable vectors and promoters are known to those of skill in the 
art and are commercially available. The following vectors are provided by way of
example. Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript,
psiX174, pBluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A
(Stratagene); pTRC99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene); pSVK3,
pBPV, pMSG, pSVL (Pharmacia). However, any other plasmid or vector may be
used as long as they are replicable and viable in the host. 

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late 
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the 
appropriate vector and promoter is well within the level of ordinary skill in the art. 

In a further embodiment, the present invention relates to host cells 
containing the above-described constructs. The host cell can be a higher eukaryotic 
cell, such as a mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or 
the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the 
construct into the host cell can be effected by calcium phosphate transfection, 
DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., 
Battey, I., Basic Methods in Molecular Biology, (1986)). 

The constructs in host cells can be used in a conventional manner to produce 
the gene product encoded by the recombinant sequence. Alternatively, the 
polypeptides of the invention can be synthetically produced by conventional peptide 
synthesizers.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or
other cells under the control of appropriate promoters. Cell-free translation systems 
can also be employed to produce such proteins using RNAs derived from the DNA 
constructs of the present invention. Appropriate cloning and expression vectors for 
use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., 
Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, 
N.Y., (1989), the disclosure of which is hereby incorporated by reference. 

Transcription of the DNA encoding the polypeptides of the present invention
by higher eukaryotes is increased by inserting an enhancer sequence into the vector.
Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act
on a promoter to increase its transcription. Examples include the SV40 enhancer on
the late side of the replication origin bp 100 to 270, a cytomegalovirus early
promoter enhancer, the polyoma enhancer on the late side of the replication origin,
and adenovirus enhancers.

Generally, recombinant expression vectors will include origins of replication 
and selectable markers permitting transformation of the host cell, e.g., the ampicillin 
resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived
from a highly-expressed gene to direct transcription of a downstream structural
sequence. Such promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or
heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination 
sequences, and preferably, a leader sequence capable of directing secretion of 
translated protein into the periplasmic space or extracellular medium. Optionally, 
the heterologous sequence can encode a fusion protein including an N-terminal 
identification peptide imparting desired characteristics, e.g., stabilization or 
simplified purification of expressed recombinant product. 

Useful expression vectors for bacterial use are constructed by inserting a 
structural DNA sequence encoding a desired protein together with suitable 
translation initiation and termination signals in operable reading phase with a 
functional promoter. The vector will comprise one or more phenotypic selectable 
markers and an origin of replication to ensure maintenance of the vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and
various species within the genera Pseudomonas, Streptomyces, and 
Staphylococcus, although others may also be employed as a matter of choice. 

As a representative but nonlimiting example, useful expression vectors for 
bacterial use can comprise a selectable marker and bacterial origin of replication
derived from commercially available plasmids comprising genetic elements of the
well known cloning vector pBR322 (ATCC 37017). Such commercial vectors
include, for example, pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden)
and GEM1 (Promega Biotec, Madison, WI, USA). These pBR322 "backbone"
sections are combined with an appropriate promoter and the structural sequence to
be expressed. 

Following transformation of a suitable host strain and growth of the host 
strain to an appropriate cell density, the selected promoter is induced by appropriate 
means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period.

Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Microbial cells employed in expression of proteins can be disrupted by any
convenient method, including freeze-thaw cycling, sonication, mechanical
disruption, or use of cell lysing agents; such methods are well known to those
skilled in the art.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell, 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example,
the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors
will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and
acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed
sequences. DNA sequences derived from the SV40 splice and polyadenylation
sites may be used to provide the required nontranscribed genetic elements.

The polypeptide can be recovered and purified from recombinant cell 
cultures by methods including ammonium sulfate or ethanol precipitation, acid 
extraction, anion or cation exchange chromatography, phosphocellulose 
chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography. 
Protein refolding steps can be used, as necessary, in completing configuration of 
the mature protein. Finally, high performance liquid chromatography (HPLC) can 
be employed for final purification steps.

The polypeptides of the present invention may be a naturally purified
product, or a product of chemical synthetic procedures, or produced by recombinant
techniques from a prokaryotic or eukaryotic host (for example, by bacterial, yeast,
higher plant, insect and mammalian cells in culture). Depending upon the host
employed in a recombinant production procedure, the polypeptides of the present
invention may be glycosylated or may be non-glycosylated. Polypeptides of the 
invention may also include an initial methionine amino acid residue. 

NAF-1 may be employed to treat spinal cord injuries or damage to 
peripheral nerves by increasing spinal cord and sensory neuron attachment and 
neurite outgrowth. 

NAF-1 may also be employed to inhibit tumor cell metastases induced by 
small cell carcinoma. The NAF-1 gene and gene product of the present invention 
may also be employed to reduce primary tumor growth, metastatic potential and 
angiogenesis in human breast carcinoma cells. 

The NAF-1 gene and gene product of the present invention may also be 
employed to promote wound healing due to its ability to promote cell-cell interaction
and cell adhesion.

NAF-1 may also be employed to modulate hemostasis. 

The polynucleotides and polypeptides of the present invention may be 
employed as research reagents and materials for discovery of treatments and 
diagnostics for human disease.

This invention provides a method for identification of the receptor for NAF-1.
The gene encoding the receptor can be identified by numerous methods known to
those of skill in the art, for example,
ligand panning and FACS sorting (Coligan, et al., Current Protocols in Immun., 
1(2), Chapter 5, (1991)). Preferably, expression cloning is employed wherein 
polyadenylated RNA is prepared from a cell responsive to NAF-1, and a cDNA 
library created from this RNA is divided into pools and used to transfect COS cells 
or other cells that are not responsive to NAF-1. Transfected cells which are grown 
on glass slides are exposed to labeled NAF-1 ligand. NAF-1 can be labeled by a 
variety of means including iodination or inclusion of a recognition site for a site- 
specific protein kinase. Following fixation and incubation, the slides are subjected 
to auto-radiographic analysis. Positive pools are identified and sub-pools are 
prepared and re-transfected using an iterative sub-pooling and re-screening process, 
eventually yielding a single clone that encodes the putative receptor. As an 
alternative approach for receptor identification, labeled ligand can be photoaffinity 
linked with cell membrane or extract preparations that express the receptor 
molecule. Cross-linked material is resolved by PAGE and exposed to X-ray film. 
The labeled complex containing the ligand-receptor can be excised, resolved into 
peptide fragments, and subjected to protein microsequencing. The amino acid 
sequence obtained from microsequencing would be used to design a set of 
degenerate oligonucleotide probes to screen a cDNA library to identify the gene 
encoding the putative receptor. 
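
The iterative sub-pooling and re-screening described above can be sketched as a simple search. The following Python fragment is a minimal illustration only: the function name, the number of sub-pools per round, and the pool_is_positive callback (which stands in for transfecting one pool, exposing the cells to labeled NAF-1 and reading the autoradiograph) are assumptions made for the example and are not part of the disclosed method.

```python
from typing import Callable, List, Sequence


def find_positive_clone(clones: Sequence[str],
                        pool_is_positive: Callable[[Sequence[str]], bool],
                        n_subpools: int = 4) -> str:
    """Narrow a positive cDNA pool down to a single clone by repeated sub-pooling.

    pool_is_positive is a hypothetical stand-in for one round of transfection
    and screening; n_subpools is an arbitrary illustrative choice.
    """
    pool: List[str] = list(clones)
    if not pool_is_positive(pool):
        raise ValueError("starting pool does not score positive")
    while len(pool) > 1:
        size = -(-len(pool) // n_subpools)  # ceiling division
        subpools = [pool[i:i + size] for i in range(0, len(pool), size)]
        # Re-screen each sub-pool and carry the first positive one forward.
        positive = next((sp for sp in subpools if pool_is_positive(sp)), None)
        if positive is None:
            raise RuntimeError("no sub-pool scored positive on re-screening")
        pool = positive
    return pool[0]
```

Each round reduces the pool roughly by the chosen factor, so a library pool of a thousand clones would be resolved in about five rounds with four sub-pools per round.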



This invention provides a method of screening compounds to identify those 
which are agonists to or antagonists to NAF-1. The identification of both types of
compounds would involve a neurite outgrowth assay. COS cells (5 x 10^8) are
transfected with NAF-1/pcDNA-1 (Invitrogen, Inc.) and conditioned medium is
collected. NAF-myc is affinity purified on a monoclonal anti-myc (9E10) affinity
column. Affinity-purified F-spondin-myc (20 mg/ml) is absorbed onto nitrocellulose
(Lemmon et al., 1989). For controls, parental COS cell-conditioned medium is
purified on the same column and used as a substrate on nitrocellulose. The
nitrocellulose is then blocked with BSA (10 mg/ml), which provides a further
control for background neurite outgrowth. Rat E14 DRG neurons are plated on
immobilized protein substrates at a density of 2-10 x 10^4 cells per 35 mm tissue
culture dish (Nunc) and grown for 14 hr. Cultures are then fixed in 4%
paraformaldehyde, permeabilized with 0.1% Triton X-100, and stained using MAb
3A10 (Furley et al., 1990; available from Developmental Studies Hybridoma Bank),
which recognizes a neuronal filament-associated protein and serves as a marker for
fine neurites. Neuronal cell bodies and neurites are visualized by indirect
immunofluorescence on a Zeiss Axioplan microscope. Neurite lengths are measured
as the distance from the edge of the soma (sharply defined by 3A10 fluorescence)
to the tip of its longest neurite. Neurite lengths are measured if the entire length
of the neurite can be unambiguously identified. About 25 neurites are measurable
within each protein-coated area (3-4 mm^2).
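
The neurite length measurements from the assay above lend themselves to a simple per-substrate summary. The Python sketch below is illustrative only; the function name, the dictionary layout and the substrate labels are assumptions, and no measured values from the specification are implied.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_neurite_lengths(
        lengths_um: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Summarize neurite lengths (micrometers) for each immobilized substrate.

    Keys are hypothetical substrate labels such as "NAF-1" or "BSA control".
    """
    summary: Dict[str, Dict[str, float]] = {}
    for substrate, lengths in lengths_um.items():
        if not lengths:
            raise ValueError(f"no measurable neurites for {substrate}")
        summary[substrate] = {
            "n": len(lengths),
            "mean_um": mean(lengths),
            # Sample standard deviation needs at least two measurements.
            "sd_um": stdev(lengths) if len(lengths) > 1 else 0.0,
        }
    return summary
```

Comparing the mean length on the test substrate with the mean on the parental COS and BSA controls gives the outgrowth attributable to the immobilized protein.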

Rat E13 dorsal spinal cord neurons can also be assayed by plating the
dissociated cells on immobilized protein substrate at a density of 10^6 cells per 35
mm tissue culture dish (Nunc). After 1 hr, the cultures are washed twice with PBS
and fixed in 4% paraformaldehyde. Cells are counted on a Zeiss Axioplan
microscope at 400 x magnification. Ten independent counts are taken from each
experiment.

An alternative example of identifying agonists and antagonists to the 
polypeptide of the present invention includes expressing the NAF-1 receptor from a
mammalian cell or membrane preparation and incubating that receptor with labeled
NAF-1 in the presence of a compound. The ability of a compound to enhance or
block the interaction is then quantified. Alternatively, the response of a known
second messenger system following interaction of NAF-1 and its receptor would be
measured and compared in the presence or absence of the compound. Such second
messenger systems include, but are not limited to, cAMP, guanylate cyclase, ion
channels or phosphoinositide hydrolysis. 
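
One way to quantify whether a compound enhances or blocks the interaction is to express the bound labeled NAF-1 as a percentage change relative to the no-compound control. The following Python sketch assumes that readout; the function name and the choice of a percentage scale are illustrative assumptions, not part of the specification.

```python
def binding_change_percent(bound_with_compound: float,
                           bound_without_compound: float) -> float:
    """Percent change in labeled NAF-1 bound to receptor caused by a compound.

    Negative values indicate blocking; positive values indicate enhancement.
    Inputs are any consistent signal units (for example, counts per minute).
    """
    if bound_without_compound <= 0:
        raise ValueError("control binding must be positive")
    delta = bound_with_compound - bound_without_compound
    return 100.0 * delta / bound_without_compound
```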

Potential antagonists include an antibody, or in some cases, an oligopeptide, 
which binds to the polypeptide. Alternatively, a potential antagonist may be a
closely related protein which binds to the receptor sites but is an inactive
form of the polypeptide, and thereby prevents the action of NAF-1 since the receptor
sites are occupied.

Another potential antagonist is an antisense construct prepared using
antisense technology. Antisense technology can be used to control gene expression
through triple-helix formation or antisense DNA or RNA, both of which methods
are based on binding of a polynucleotide to DNA or RNA. For example, the 5'
coding portion of the polynucleotide sequence, which encodes for the mature
polypeptides of the present invention, is used to design an antisense RNA
oligonucleotide of from about 10 to 40 base pairs in length. A DNA oligonucleotide
is designed to be complementary to a region of the gene involved in transcription
(triple helix - see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al.,
Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby
preventing transcription and the production of NAF-1. The antisense RNA
oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the
mRNA molecule into NAF-1 polypeptide (Antisense - Okano, J. Neurochem.,
56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression,
CRC Press, Boca Raton, FL (1988)). The oligonucleotides described above can
also be delivered to cells such that the antisense RNA or DNA may be expressed in
vivo to inhibit production of NAF-1.
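
Designing an antisense RNA oligonucleotide against the 5' coding portion amounts to taking the reverse complement of a 10 to 40 base window of the coding strand. The Python sketch below illustrates that step under stated assumptions: the function name and default window are hypothetical, and the short input used in the usage lines is an arbitrary toy sequence rather than the NAF-1 sequence.

```python
# DNA coding-strand base -> complementary RNA base
_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def antisense_rna(coding_dna: str, start: int = 0, length: int = 20) -> str:
    """Return the antisense RNA (5'->3') for a window of a coding-strand DNA.

    The 10-40 base limit mirrors the range given in the text; start and
    length defaults are illustrative assumptions.
    """
    if not 10 <= length <= 40:
        raise ValueError("oligonucleotide length should be 10 to 40 bases")
    target = coding_dna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the sequence")
    # Reverse the window, then complement each base.
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(target))


# Toy sequence for illustration only.
oligo = antisense_rna("ATGGCCAAGCTT", start=0, length=12)
print(oligo)
```

Because the mRNA has the same sequence as the coding strand (with U in place of T), the returned oligonucleotide is complementary to the corresponding stretch of mRNA.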

Potential antagonists include a small molecule which binds to and occupies 
the catalytic site of the polypeptide thereby making the catalytic site inaccessible to 
substrate such that normal biological activity is prevented. Examples of small 
molecules include but are not limited to small peptides or peptide-like molecules. 

The antagonists may be employed to treat malarial infection induced by 
Plasmodium falciparum. During malarial infection, the polypeptide of the present 
invention may promote adhesion of parasitized red cells to endothelial cells and, 
therefore, antagonists would inhibit this action and prevent malaria. The 
antagonists may also be employed to treat cancer, for example, in blocking 
metastasis by inhibiting cell adhesion.

The polypeptides of the present invention or antagonists and agonists may 
be employed in combination with a suitable pharmaceutical carrier. Such 
compositions comprise a therapeutically effective amount of the polypeptide or 
antagonist or agonist, and a pharmaceutically acceptable carrier or excipient. Such
a carrier includes but is not limited to saline, buffered saline, dextrose, water,
glycerol, ethanol, and combinations thereof. The formulation should suit the mode 
of administration.

The invention also provides a pharmaceutical pack or kit comprising one or
more containers filled with one or more of the ingredients of the pharmaceutical 
compositions of the invention. Associated with such container(s) can be a notice in 
the form prescribed by a governmental agency regulating the manufacture, use or 
sale of pharmaceuticals or biological products, which notice reflects approval by the 
agency of manufacture, use or sale for human administration. In addition, the 
polypeptides of the present invention or agonists or antagonists may be employed in 
conjunction with other therapeutic compounds. 

The pharmaceutical compositions may be administered in a convenient 
manner such as by the oral, topical, parenteral, intravenous, intraperitoneal,
intramuscular, subcutaneous, intranasal or intradermal routes. The pharmaceutical
compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an
amount of at least about 10 µg/kg body weight and in most cases they will be
administered in an amount not in excess of about 8 mg/kg body weight per day. In
most cases, the dosage is from about 10 µg/kg to about 1 mg/kg body weight daily,
taking into account the routes of administration, symptoms, etc. 
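
The weight-based ranges above convert to a total daily amount by simple multiplication. The Python sketch below shows only that arithmetic; the function name is hypothetical, the bounds are the approximate figures stated in the text, and the example is not dosing guidance.

```python
MIN_DOSE_UG_PER_KG = 10.0      # "at least about 10 µg/kg"
MAX_DOSE_UG_PER_KG = 8000.0    # "not in excess of about 8 mg/kg"


def daily_amount_mg(body_weight_kg: float, dose_ug_per_kg: float) -> float:
    """Total daily amount in mg for a weight-based dose given in µg/kg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_DOSE_UG_PER_KG <= dose_ug_per_kg <= MAX_DOSE_UG_PER_KG:
        raise ValueError("dose lies outside the range stated in the text")
    return body_weight_kg * dose_ug_per_kg / 1000.0  # µg -> mg
```

For a 70 kg body weight, the commonly used range of about 10 µg/kg to about 1 mg/kg corresponds to roughly 0.7 mg to 70 mg daily.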

The NAF-1 polypeptides and agonists and antagonists which are 
polypeptides may also be employed in accordance with the present invention by 
expression of such polypeptides in vivo, which is often referred to as "gene 
therapy." 

Thus, for example, cells from a patient may be engineered with a 
polynucleotide (DNA or RNA) encoding a polypeptide ex vivo, with the engineered 
cells then being provided to a patient to be treated with the polypeptide. Such 
methods are well-known in the art and are apparent from the teachings herein. For 
example, cells may be engineered by the use of a retroviral plasmid vector 
containing RNA encoding a polypeptide of the present invention. 

Similarly, cells may be engineered in vivo for expression of a polypeptide in 
vivo by, for example, procedures known in the art. For example, a packaging cell 
is transduced with a retroviral plasmid vector containing RNA encoding a 
polypeptide of the present invention such that the packaging cell now produces 
infectious viral particles containing the gene of interest. These producer cells may 
be administered to a patient for engineering cells in vivo and expression of the 
polypeptide in vivo. These and other methods for administering a polypeptide of 
the present invention by such method should be apparent to those skilled in the art 
from the teachings of the present invention. 

Retroviruses from which the retroviral plasmid vectors hereinabove 
mentioned may be derived include, but are not limited to, Moloney Murine
Leukemia Virus, spleen necrosis virus, retroviruses such as Rous Sarcoma Virus,
Harvey Sarcoma Virus, avian leukosis virus, gibbon ape leukemia virus, human 
immunodeficiency virus, adenovirus, Myeloproliferative Sarcoma Virus, and 
mammary tumor virus. In one embodiment, the retroviral plasmid vector is derived 
from Moloney Murine Leukemia Virus. 

The vector includes one or more promoters. Suitable promoters which may 
be employed include, but are not limited to, the retroviral LTR; the SV40 promoter; 
and the human cytomegalovirus (CMV) promoter described in Miller, et al., 
Biotechniques, Vol. 7, No. 9, 980-990 (1989), or any other promoter (e.g.,
cellular promoters such as eukaryotic cellular promoters including, but not limited
to, the histone, pol III, and β-actin promoters). Other viral promoters which may
be employed include, but are not limited to, adenovirus promoters, thymidine
kinase (TK) promoters, and B19 parvovirus promoters. The selection of a suitable
promoter will be apparent to those skilled in the art from the teachings contained
herein.

The nucleic acid sequence encoding the polypeptide of the present invention
is under the control of a suitable promoter. Suitable promoters which may be
employed include, but are not limited to, adenoviral promoters, such as the
adenoviral major late promoter; or heterologous promoters, such as the
cytomegalovirus (CMV) promoter; the respiratory syncytial virus (RSV) promoter;
inducible promoters, such as the MMT promoter, the metallothionein promoter; heat
shock promoters; the albumin promoter; the ApoAI promoter; human globin
promoters; viral thymidine kinase promoters, such as the Herpes Simplex thymidine
kinase promoter; retroviral LTRs (including the modified retroviral LTRs
hereinabove described); the β-actin promoter; and human growth hormone
promoters. The promoter also may be the native promoter which controls the gene 
encoding the polypeptide. 

The retroviral plasmid vector is employed to transduce packaging cell lines 
to form producer cell lines. Examples of packaging cells which may be transfected 
include, but are not limited to, the PE501, PA317, ψ-2, ψ-AM, PA12, T19-14X,
VT-19-17-H2, ψCRE, ψCRIP, GP+E-86, GP+envAm12, and DAN cell lines as
described in Miller, Human Gene Therapy, Vol. 1, pgs. 5-14 (1990), which is
incorporated herein by reference in its entirety. The vector may transduce the
packaging cells through any means known in the art. Such means include, but are
not limited to, electroporation, the use of liposomes, and CaPO4 precipitation. In
one alternative, the retroviral plasmid vector may be encapsulated into a liposome, 
or coupled to a lipid, and then administered to a host. 



The producer cell line generates infectious retroviral vector particles which 
include the nucleic acid sequence(s) encoding the polypeptides. Such retroviral 
vector particles then may be employed to transduce eukaryotic cells, either in vitro
or in vivo. The transduced eukaryotic cells will express the nucleic acid 
sequence(s) encoding the polypeptide. Eukaryotic cells which may be transduced 
include, but are not limited to, embryonic stem cells, embryonic carcinoma cells, as 
well as hematopoietic stem cells, hepatocytes, fibroblasts, myoblasts, keratinocytes, 
endothelial cells, and bronchial epithelial cells. 

This invention is also related to the use of the gene of the present invention 
as a diagnostic. Detection of a mutated form of the gene will allow a diagnosis of a 
disease or a susceptibility to a disease which results from underexpression of
NAF-1, for example, tumor metastases and tumor angiogenesis.

Individuals carrying mutations in the gene of the present invention may be 
detected at the DNA level by a variety of techniques. Nucleic acids for diagnosis 
may be obtained from a patient's cells, including but not limited to blood, urine, 
saliva, tissue biopsy and autopsy material. The genomic DNA may be used directly 
for detection or may be amplified enzymatically by using PCR (Saiki et al., Nature,
324:163-166 (1986)) prior to analysis. RNA or cDNA may also be used for the 
same purpose. As an example, PCR primers complementary to the nucleic acid 
encoding NAF-1 can be used to identify and analyze mutations. For example, 
deletions and insertions can be detected by a change in size of the amplified product 
in comparison to the normal genotype. Point mutations can be identified by 
hybridizing amplified DNA to radiolabeled RNA or alternatively, radiolabeled 
antisense DNA sequences. Perfectly matched sequences can be distinguished from 
mismatched duplexes by RNase A digestion or by differences in melting 
temperatures. 

Sequence differences between the reference gene and genes having 
mutations may be revealed by the direct DNA sequencing method. In addition, 
cloned DNA segments may be employed as probes to detect specific DNA 
segments. The sensitivity of this method is greatly enhanced when combined with 
PCR. For example, a sequencing primer is used with double-stranded PCR 
product or a single-stranded template molecule generated by a modified PCR. The 
sequence determination is performed by conventional procedures with radiolabeled 
nucleotide or by automatic sequencing procedures with fluorescent-tags. 

Genetic testing based on DNA sequence differences may be achieved by 
detection of alteration in electrophoretic mobility of DNA fragments in gels with or 
without denaturing agents. Small sequence deletions and insertions can be 
visualized by high resolution gel electrophoresis. DNA fragments of different
sequences may be distinguished on denaturing formamide gradient gels in which the
mobilities of different DNA fragments are retarded in the gel at different positions
according to their specific melting or partial melting temperatures (see, e.g., Myers
et al., Science, 230:1242 (1985)).

Sequence changes at specific locations may also be revealed by nuclease 
protection assays, such as RNase and S1 protection or the chemical cleavage
method (e.g., Cotton et al., PNAS, USA, 85:4397-4401 (1985)).

Thus, the detection of a specific DNA sequence may be achieved by 
methods such as hybridization, RNase protection, chemical cleavage, direct DNA 
sequencing or the use of restriction enzymes (e.g., Restriction Fragment Length
Polymorphisms (RFLP)) and Southern blotting of genomic DNA. 
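
The restriction enzyme approach can be illustrated by comparing the fragment lengths predicted for a reference and a variant sequence. In the Python sketch below the function name and the two short sequences are invented for the example; the only outside fact relied on is that BamHI recognizes GGATCC and cuts after the first G.

```python
from typing import List


def fragment_lengths(sequence: str, site: str, cut_offset: int) -> List[int]:
    """Lengths of the fragments produced by cutting a linear DNA at every site.

    cut_offset is the number of bases of the recognition site that remain on
    the upstream fragment (1 for BamHI, G^GATCC).
    """
    seq, site = sequence.upper(), site.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Toy sequences: the variant carries a point change that destroys one site.
reference = "AAAAGGATCCTTTTTTGGATCCCC"
variant = "AAAAGGATCCTTTTTTGGATTCCC"
print(fragment_lengths(reference, "GGATCC", 1))
print(fragment_lengths(variant, "GGATCC", 1))
```

A point change that removes a recognition site merges two fragments into one, which is the size difference that would be seen on a Southern blot.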

In addition to more conventional gel-electrophoresis and DNA sequencing, 
mutations can also be detected by in situ analysis. 

The present invention also relates to a diagnostic assay for detecting altered 
levels of the polypeptide of the present invention in various tissues, since an
overexpression of the proteins compared to normal control tissue samples can detect the
presence of NAF-1 and conditions related to an overexpression of NAF-1, for 
example, tumor metastases and angiogenesis. Assays used to detect levels of the 
polypeptide of the present invention in a sample derived from a host are well-known 
to those of skill in the art and include radioimmunoassays, competitive-binding 
assays, Western Blot analysis and preferably an ELISA assay. An ELISA assay 
initially comprises preparing an antibody specific to the NAF-1 antigen, preferably a 
monoclonal antibody. In addition a reporter antibody is prepared against the 
monoclonal antibody. To the reporter antibody is attached a detectable reagent such 
as radioactivity, fluorescence or in this example a horseradish peroxidase enzyme. 
A sample is now removed from a host and incubated on a solid support, e.g. a 
polystyrene dish, that binds the proteins in the sample. Any free protein binding 
sites on the dish are then covered by incubating with a non-specific protein such as 
bovine serum albumin. Next, the monoclonal antibody is incubated in the dish 
during which time the monoclonal antibodies attach to any of the polypeptide of
the present invention attached to the polystyrene dish. All unbound monoclonal
antibody is washed out with buffer. The reporter antibody linked to horseradish 
peroxidase is now placed in the dish resulting in binding of the reporter antibody to 
any monoclonal antibody bound to the polypeptide of the present invention. 
Unattached reporter antibody is then washed out. Peroxidase substrates are then 
added to the dish and the amount of color developed in a given time period is a 
measurement of the amount of the polypeptide of the present invention present in a 
given volume of patient sample when compared against a standard curve. 
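
Reading a sample against a standard curve, as described at the end of the ELISA procedure, can be done by interpolating between the two nearest standards. The Python sketch below assumes linear interpolation and a monotonically rising curve; the function name and the data layout are illustrative assumptions, and other curve fits could equally be used.

```python
from bisect import bisect_left
from typing import List, Tuple


def concentration_from_signal(signal: float,
                              standards: List[Tuple[float, float]]) -> float:
    """Interpolate a concentration from (concentration, signal) standards.

    Assumes the signal (color developed in a fixed time) rises with
    concentration; linear interpolation is a simplifying assumption.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (c_low, s_low), (c_high, s_high) = points[i - 1], points[i]
    return c_low + (c_high - c_low) * (signal - s_low) / (s_high - s_low)
```

Samples whose signal falls outside the range of the standards are rejected rather than extrapolated, since the curve is not characterized there.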

A competition assay may be employed wherein antibodies specific to the
polypeptide of the present invention are attached to a solid support and labeled 
NAF-1 and a sample derived from the host are passed over the solid support and the 
amount of label detected attached to the solid support can be correlated to a quantity 
of the polypeptide of the present invention in the sample. 

The sequences of the present invention are also valuable for chromosome 
identification. The sequence is specifically targeted to and can hybridize with a 
particular location on an individual human chromosome. Moreover, there is a 
current need for identifying particular sites on the chromosome. Few chromosome 
marking reagents based on actual sequence data (repeat polymorphisms) are 
presently available for marking chromosomal location. The mapping of DNAs to 
chromosomes according to the present invention is an important first step in 
correlating those sequences with genes associated with disease. 

Briefly, sequences can be mapped to chromosomes by preparing PCR 
primers (preferably 15-25 bp) from the cDNA. Computer analysis of the 3' 
untranslated region of the gene is used to rapidly select primers that do not span
more than one exon in the genomic DNA, since spanning an exon boundary would
complicate the amplification process. These primers are then used for PCR
screening of somatic cell hybrids
containing individual human chromosomes. Only those hybrids containing the 
human gene corresponding to the primer will yield an amplified fragment. 
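
The assignment step above reduces to finding the chromosome whose presence across the hybrid panel matches the pattern of positive and negative PCR results. The Python sketch below illustrates that concordance test; the function name, the panel layout and the hybrid names in any usage are hypothetical.

```python
from typing import Dict, List, Set


def concordant_chromosomes(pcr_positive: Dict[str, bool],
                           hybrid_content: Dict[str, Set[str]]) -> List[str]:
    """Chromosomes whose presence in each hybrid matches the PCR result.

    pcr_positive maps a hybrid name to whether an amplified fragment was seen;
    hybrid_content maps a hybrid name to the human chromosomes it retains.
    """
    all_chromosomes: Set[str] = set().union(*hybrid_content.values())
    matches = []
    for chromosome in all_chromosomes:
        # Concordant only if present in every positive hybrid and absent
        # from every negative hybrid.
        if all((chromosome in hybrid_content[hybrid]) == positive
               for hybrid, positive in pcr_positive.items()):
            matches.append(chromosome)
    return sorted(matches)
```

With a well-designed panel exactly one chromosome is concordant; several matches indicate that more hybrids are needed to resolve the assignment.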

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a 
particular DNA to a particular chromosome. Using the present invention with the
same oligonucleotide primers, sublocalization can be achieved with panels of 
fragments from specific chromosomes or pools of large genomic clones in an 
analogous manner. Other mapping strategies that can similarly be used to map to its 
chromosome include in situ hybridization, prescreening with labeled flow-sorted 
chromosomes and preselection by hybridization to construct chromosome-specific
cDNA libraries.

Fluorescence in situ hybridization (FISH) of a cDNA clone to a metaphase 
chromosomal spread can be used to provide a precise chromosomal location in one 
step. This technique can be used with cDNA having at least 50 or 60 bases. For a 
review of this technique, see Verma et al., Human Chromosomes: a Manual of 
Basic Techniques, Pergamon Press, New York (1988). 

Once a sequence has been mapped to a precise chromosomal location, the 
physical position of the sequence on the chromosome can be correlated with genetic 
map data. Such data are found, for example, in V. McKusick, Mendelian 
Inheritance in Man (available on line through Johns Hopkins University Welch 
Medical Library). The relationship between genes and diseases that have been
mapped to the same chromosomal region is then identified through linkage
analysis (coinheritance of physically adjacent genes).

Next, it is necessary to determine the differences in the cDNA or genomic 
sequence between affected and unaffected individuals. If a mutation is observed in 
some or all of the affected individuals but not in any normal individuals, then the 
mutation is likely to be the causative agent of the disease. 

With current resolution of physical mapping and genetic mapping 
techniques, a cDNA precisely localized to a chromosomal region associated with the 
disease could be one of between 50 and 500 potential causative genes. (This
assumes 1 megabase mapping resolution and one gene per 20 kb.)
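
The arithmetic behind this estimate is a division of the mapped interval by the assumed gene spacing. The short Python sketch below restates it; the function name is hypothetical, and the observation that the upper figure of 500 would correspond to an interval of about 10 megabases at the same density is an inference made here for illustration, not a statement of the specification.

```python
def candidate_gene_count(interval_bp: int, bp_per_gene: int = 20_000) -> int:
    """Approximate number of genes in a mapped interval at a uniform density."""
    if interval_bp < 0 or bp_per_gene <= 0:
        raise ValueError("interval must be non-negative and spacing positive")
    return interval_bp // bp_per_gene


print(candidate_gene_count(1_000_000))    # 1 megabase at one gene per 20 kb
print(candidate_gene_count(10_000_000))   # 10 megabases at the same density
```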

The polypeptides, their fragments or other derivatives, or analogs thereof, 
or cells expressing them can be used as an immunogen to produce antibodies 
thereto. These antibodies can be, for example, polyclonal or monoclonal 
antibodies. The present invention also includes chimeric, single chain, and 
humanized antibodies, as well as Fab fragments, or the product of an Fab 
expression library. Various procedures known in the art may be used for the 
production of such antibodies and fragments. 

Antibodies generated against the polypeptides corresponding to a sequence 
of the present invention can be obtained by direct injection of the polypeptides into 
an animal or by administering the polypeptides to an animal, preferably a 
nonhuman. The antibody so obtained will then bind the polypeptides themselves. In this
manner even a sequence encoding only a fragment of the polypeptides can be used 
to generate antibodies binding the whole native polypeptides. Such antibodies can 
then be used to isolate the polypeptide from tissue expressing that polypeptide. 

For preparation of monoclonal antibodies, any technique which provides 
antibodies produced by continuous cell line cultures can be used. Examples include
the hybridoma technique (Kohler and Milstein, 1975, Nature, 256:495-497), the 
trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, 
Immunology Today 4:72), and the EBV-hybridoma technique to produce human 
monoclonal antibodies (Cole, et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96). 

Techniques described for the production of single chain antibodies (U.S. 
Patent 4,946,778) can be adapted to produce single chain antibodies to 
immunogenic polypeptide products of this invention. Also, transgenic mice may be 
used to express humanized antibodies to immunogenic polypeptide products of this 
invention. 

The above-described antibodies may be employed to isolate or to identify 
clones expressing the polypeptide or to purify the polypeptide of the present
invention by attachment of the antibody to a solid support for isolation and/or
purification by affinity chromatography. 

The present invention will be further described with reference to the 
following examples; however, it is to be understood that the present invention is not 
limited to such examples. All parts or amounts, unless otherwise specified, are by
weight. 

In order to facilitate understanding of the following examples certain 
frequently occurring methods and/or terms will be described. 

"Plasmids" are designated by a lower case p preceded and/or followed by 
capital letters and/or numbers. The starting plasmids herein are either commercially
available, publicly available on an unrestricted basis, or can be constructed from
available plasmids in accord with published procedures. In addition, equivalent
plasmids to those described are known in the art and will be apparent to the
ordinarily skilled artisan.

"Digestion" of DNA refers to catalytic cleavage of the DNA with a
restriction enzyme that acts only at certain sequences in the DNA. The various
restriction enzymes used herein are commercially available and their reaction
conditions, cofactors and other requirements were used as would be known to the
ordinarily skilled artisan. For analytical purposes, typically 1 µg of plasmid or
DNA fragment is used with about 2 units of enzyme in about 20 µl of buffer
solution. For the purpose of isolating DNA fragments for plasmid construction,
typically 5 to 50 µg of DNA are digested with 20 to 250 units of enzyme in a larger
volume. Appropriate buffers and substrate amounts for particular restriction
enzymes are specified by the manufacturer. Incubation times of about 1 hour at
37°C are ordinarily used, but may vary in accordance with the supplier's
instructions. After digestion the reaction is electrophoresed directly on a
polyacrylamide gel to isolate the desired fragment.

Size separation of the cleaved fragments is performed using 8 percent 
polyacrylamide gel described by Goeddel, D. et al., Nucleic Acids Res., 8:4057
(1980).

"Oligonucleotides" refers to either a single stranded polydeoxynucleotide or 
two complementary polydeoxynucleotide strands which may be chemically 
synthesized. Such synthetic oligonucleotides have no 5' phosphate and thus will 
not ligate to another oligonucleotide without adding a phosphate with an ATP in the 
presence of a kinase. A synthetic oligonucleotide will ligate to a fragment that has 
not been dephosphorylated. 

"Ligation" refers to the process of forming phosphodiester bonds between 
two double stranded nucleic acid fragments (Maniatis, T., et al., Id., p. 146). 
Unless otherwise provided, ligation may be accomplished using known buffers and 
conditions with 10 units of T4 DNA ligase ("ligase") per 0.5 µg of approximately 
equimolar amounts of the DNA fragments to be ligated. Unless otherwise 
stated, transformation was performed as described in the method of Graham, F. and 
Van der Eb, A., Virology, 52:456-457 (1973). 
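
The "approximately equimolar" condition can be illustrated with a short calculation. The following is a minimal sketch, not part of the described procedure: it assumes that the mass of a double stranded fragment is proportional to its length, and the fragment lengths (3000 and 1000 base pairs) and the function name are hypothetical values chosen only for illustration.

```python
def equimolar_split(total_mass, vector_bp, insert_bp):
    """Split total_mass between two double stranded fragments so that both
    are present in equal molar amounts.

    Assumption: mass per mole is proportional to fragment length, so each
    fragment receives a share of the mass equal to its share of the length.
    """
    total_bp = vector_bp + insert_bp
    vector_mass = total_mass * vector_bp / total_bp
    insert_mass = total_mass * insert_bp / total_bp
    return vector_mass, insert_mass


# Hypothetical example: 0.5 mass units of DNA in total,
# a 3000 bp vector and a 1000 bp insert.
vector_mass, insert_mass = equimolar_split(0.5, vector_bp=3000, insert_bp=1000)
print(vector_mass, insert_mass)
```

The longer fragment takes the larger share of the mass, which is why equal masses of vector and insert are generally not equimolar.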
Example 1 

Bacterial Expression and Purification of NAF-1 

The DNA sequence encoding NAF-1, ATCC # 97343, is initially amplified using 
PCR oligonucleotide primers corresponding to the 5' sequences of the processed 
NAF-1 protein (minus the signal peptide sequence) and the vector sequences 3' to 
the NAF-1 gene. Additional nucleotides corresponding to NAF-1 are added to the 
5' and 3' sequences respectively. The 5' oligonucleotide primer has the sequence 
5' GCCATACGGGATCCCAGCCTCTTGGGGGAGAGTCC 3' (SEQ ID NO:3) and 
contains a BamHI restriction enzyme site followed by 21 nucleotides of NAF-1 
coding sequence starting from the codon of the presumed terminal amino acid of the 
processed protein. The 3' sequence 
5' GGCATACGTCTAGATTAGACGCAGTTATCAGGGAC 3' (SEQ ID NO:4) 
contains complementary sequences to an XbaI site and is followed by 21 
nucleotides of NAF-1. The restriction enzyme sites correspond to the restriction 
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth, 
CA). pQE-9 encodes antibiotic resistance (Amp r), a bacterial origin of replication 
(ori), an IPTG-regulatable promoter operator (P/O), a ribosome binding site (RBS), 
a 6-His tag and restriction enzyme sites. pQE-9 is then digested with BamHI and 
XbaI. The amplified sequences are ligated into pQE-9 and are inserted in frame 
with the sequence encoding the histidine tag and the RBS. The ligation mixture 
is then used to transform E. coli strain M15/rep4 (Qiagen, Inc.) by the procedure 
described in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, Cold 
Spring Harbor Laboratory Press, (1989). M15/rep4 contains multiple copies of the plasmid 
pREP4, which expresses the lacI repressor and also confers kanamycin resistance 
(Kan r). Transformants are identified by their ability to grow on LB plates and 
ampicillin/kanamycin resistant colonies are selected. Plasmid DNA is isolated and 
confirmed by restriction analysis. Clones containing the desired constructs are 
grown overnight (O/N) in liquid culture in LB media supplemented with both Amp 
(100 µg/ml) and Kan (25 µg/ml). The O/N culture is used to inoculate a large 
culture at a ratio of 1:100 to 1:250. The cells are grown to an optical density 600 
(O.D.600) of between 0.4 and 0.6. IPTG ("isopropyl-β-D-thiogalactopyranoside") 
is then added to a final concentration of 1 mM. IPTG induces by 
inactivating the lacI repressor, clearing the P/O and leading to increased gene 
expression. Cells are grown an extra 3 to 4 hours. Cells are then harvested by 
centrifugation. The cell pellet is solubilized in the chaotropic agent 6 molar 
guanidine HCl. After clarification, solubilized NAF-1 is purified from this solution 
by chromatography on a Nickel-Chelate column under conditions that allow for 
tight binding by proteins containing the 6-His tag. NAF-1 is eluted from the 
column in 6 molar guanidine HCl pH 5.0 and for the purpose of renaturation 
adjusted to 3 molar guanidine HCl, 100 mM sodium phosphate, 10 mmolar 
glutathione (reduced) and 2 mmolar glutathione (oxidized). After incubation in this 
solution for 12 hours the protein is dialyzed to 10 mmolar sodium phosphate. 
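
The layout of the 3' primer can be checked with a few lines of code. The following is a minimal sketch, assuming the primer sequence exactly as printed for SEQ ID NO:4 and the XbaI recognition sequence TCTAGA; the variable and function names are illustrative only. It locates the restriction site, takes the 21 nucleotides that follow it, and reverse-complements them to recover the sense-strand end of the coding sequence.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


PRIMER_3 = "GGCATACGTCTAGATTAGACGCAGTTATCAGGGAC"  # SEQ ID NO:4 as printed
XBAI_SITE = "TCTAGA"                              # XbaI recognition sequence

# Position of the restriction site within the primer (0-based).
site_start = PRIMER_3.find(XBAI_SITE)

# Nucleotides 3' of the site: the part that anneals to NAF-1.
tail = PRIMER_3[site_start + len(XBAI_SITE):]

# The same stretch read on the sense strand.
sense_tail = reverse_complement(tail)

print(site_start, len(tail), sense_tail)
```

Read on the sense strand, the 21 nucleotides are GTC CCT GAT AAC TGC GTC TAA, that is, the last six codons shown for SEQ ID NO:1 (Val Pro Asp Asn Cys Val) followed by a stop codon.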

Example 2 

Cloning and expression of NAF-1 using the baculovirus expression system 

The DNA sequence encoding the full length NAF-1 protein, ATCC # 
97343, was amplified using PCR oligonucleotide primers corresponding to the 5' 
and 3' sequences of the gene: 

The 5' primer has the sequence 
5' GCCATACGGGATCCGCCATCATGGAAAACCCCAGCCCGGCC 3' (SEQ ID NO:5) and contains a BamHI 
restriction enzyme site (GGATCC) followed by 8 nucleotides resembling an efficient 
signal for the initiation of translation in eukaryotic cells (Kozak, M., J. Mol. Biol., 
196:947-950 (1987)), which is just behind the first 21 nucleotides of the NAF-1 gene 
(the initiation codon for translation is "ATG"). 

The 3' primer has the sequence 
5' GGCATACGTCTAGATTAGACGCAGTTATCAGGGAC 3' (SEQ ID NO:6) and contains the cleavage site for 
the restriction endonuclease XbaI and 21 nucleotides complementary to the 3' end 
of the translated sequence of the NAF-1 gene. The amplified sequences were 
isolated from a 1% agarose gel using a commercially available kit ("Geneclean," 
BIO 101 Inc., La Jolla, Ca.). The fragment was then digested with the 
endonucleases BamHI and XbaI and then purified again on a 1% agarose gel. This 
fragment is designated F2. 
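
The reading frame of the 5' primer can be verified in the same spirit. The following is a minimal sketch, assuming the primer as printed for SEQ ID NO:5 and the standard genetic code; the helper names are illustrative only. It finds the BamHI site (GGATCC) and the first ATG, then translates from that ATG to the end of the primer.

```python
from itertools import product

# Standard genetic code, built in TCAG order (one-letter amino acid codes,
# "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons of seq, starting at its first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


PRIMER_5 = "GCCATACGGGATCCGCCATCATGGAAAACCCCAGCCCGGCC"  # SEQ ID NO:5 as printed

bamhi_start = PRIMER_5.find("GGATCC")  # BamHI recognition sequence
atg_start = PRIMER_5.find("ATG")       # first ATG in the primer
peptide = translate(PRIMER_5[atg_start:])

print(bamhi_start, atg_start, peptide)
```

The translated stretch, MENPSPA, corresponds to the first seven residues listed for SEQ ID NO:2 (Met Glu Asn Pro Ser Pro Ala).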

The vector pRG1 (modification of pVL941 vector, discussed below) is used 
for the expression of the NAF-1 protein using the baculovirus expression system 
(for review see: Summers, M.D. and Smith, G.E. 1987, A manual of methods for 
baculovirus vectors and insect cell culture procedures, Texas Agricultural 
Experimental Station Bulletin No. 1555). This expression vector contains the 
strong polyhedrin promoter of the Autographa californica nuclear polyhedrosis 
virus (AcMNPV) followed by the recognition sites for the restriction endonucleases 
BamHI and XbaI. The polyadenylation site of the simian virus (SV)40 is used for 
efficient polyadenylation. For an easy selection of recombinant virus the 
beta-galactosidase gene from E. coli is inserted in the same orientation as the polyhedrin 
promoter followed by the polyadenylation signal of the polyhedrin gene. The 
polyhedrin sequences are flanked at both sides by viral sequences for the 
cell-mediated homologous recombination of co-transfected wild-type viral DNA. Many 
other baculovirus vectors could be used in place of pRG1 such as pAc373, pVL941 
and pAcIM1 (Luckow, V.A. and Summers, M.D., Virology, 170:31-39). 



The plasmid was digested with the restriction enzymes BamHI and XbaI 
and then dephosphorylated using calf intestinal phosphatase by procedures known 
in the art. The DNA was then isolated from a 1% agarose gel using the 
commercially available kit ("Geneclean" BIO 101 Inc., La Jolla, Ca.). This vector 
DNA is designated V2. 

Fragment F2 and the dephosphorylated plasmid V2 were ligated with T4 
DNA ligase. E. coli XL1 blue cells were then transformed and bacteria identified 
that contained the plasmid (pBacNAF-1) with the NAF-1 gene using the enzymes 
BamHI and XbaI. The sequence of the cloned fragment was confirmed by DNA 
sequencing. 

5 µg of the plasmid pBacNAF-1 was co-transfected with 1.0 µg of a 
commercially available linearized baculovirus ("BaculoGold™ baculovirus DNA", 
Pharmingen, San Diego, CA.) using the lipofection method (Felgner et al. Proc. 
Natl. Acad. Sci. USA, 84:7413-7417 (1987)). 

1 µg of BaculoGold™ virus DNA and 5 µg of the plasmid pBacNAF-1 were 
mixed in a sterile well of a microtiter plate containing 50 µl of serum free Grace's 
medium (Life Technologies Inc., Gaithersburg, MD). Afterwards 10 µl Lipofectin 
plus 90 µl Grace's medium were added, mixed and incubated for 15 minutes at 
room temperature. Then the transfection mixture was added drop-wise to the Sf9 
insect cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml 
Grace's medium without serum. The plate was rocked back and forth to mix the 
newly added solution. The plate was then incubated for 5 hours at 27°C. After 5 
hours the transfection solution was removed from the plate and 1 ml of Grace's 
insect medium supplemented with 10% fetal calf serum was added. The plate was 
put back into an incubator and cultivation continued at 27°C for four days. 

After four days the supernatant was collected and a plaque assay performed 
similar to that described by Summers and Smith (supra). As a modification an agarose 
gel with "Blue Gal" (Life Technologies Inc., Gaithersburg) was used which allows 
an easy isolation of blue stained plaques. (A detailed description of a "plaque 
assay" can also be found in the user's guide for insect cell culture and 
baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10). 

Four days after the serial dilution, the virus was added to the cells and blue 
stained plaques were picked with the tip of an Eppendorf pipette. The agar 
containing the recombinant viruses was then resuspended in an Eppendorf tube 
containing 200 µl of Grace's medium. The agar was removed by a brief 
centrifugation and the supernatant containing the recombinant baculovirus was used 
to infect Sf9 cells seeded in 35 mm dishes. Four days later the supernatants of 
these culture dishes were harvested and then stored at 4°C. 



Sf9 cells were grown in Grace's medium supplemented with 10% 
heat-inactivated FBS. The cells were infected with the recombinant baculovirus 
V-NAF-1 at a multiplicity of infection (MOI) of 2. Six hours later the medium was removed 
and replaced with SF900 II medium minus methionine and cysteine (Life 
Technologies Inc., Gaithersburg). 42 hours later 5 µCi of 35S-methionine and 5 
µCi 35S-cysteine (Amersham) were added. The cells were further incubated for 16 
hours before they were harvested by centrifugation and the labelled proteins 
visualized by SDS-PAGE and autoradiography. 
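
The multiplicity of infection relates the number of infectious particles to the number of cells. The following is a minimal sketch of that arithmetic, not a description of how the inoculum was prepared here: the cell count and the virus titer are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
def inoculum_volume_ml(cell_count, moi, titer_pfu_per_ml):
    """Volume of virus stock (ml) that delivers `moi` plaque-forming units
    per cell, given the stock titer in pfu/ml."""
    return cell_count * moi / titer_pfu_per_ml


# Hypothetical example: 1e6 Sf9 cells, MOI of 2, stock titer of 1e8 pfu/ml.
volume = inoculum_volume_ml(cell_count=1e6, moi=2, titer_pfu_per_ml=1e8)
print(f"{volume:.3f} ml of virus stock")
```

In practice the titer would come from a plaque assay such as the one described above.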

Example 3 

Expression of Recombinant NAF-1 in COS cells 

The expression plasmid, NAF-1 HA, is derived from a vector pcDNAI/Amp 
(Invitrogen) containing: 1) SV40 origin of replication, 2) ampicillin resistance 
gene, 3) E. coli replication origin, 4) CMV promoter followed by a polylinker 
region, an SV40 intron and polyadenylation site. A DNA fragment encoding the 
entire NAF-1 precursor and a HA tag fused in frame to its 3' end is cloned into the 
polylinker region of the vector, placing the recombinant protein expression under 
control of the CMV promoter. The HA tag corresponds to an epitope derived from 
the influenza hemagglutinin protein as previously described (I. Wilson, H. Niman, 
R. Houghten, A. Cherenson, M. Connolly, and R. Lerner, Cell 37:767 
(1984)). The fusion of HA tag to the NAF-1 protein allows easy detection of the 
recombinant protein with an antibody that recognizes the HA epitope. 
The plasmid construction strategy is described as follows: 

The DNA sequence encoding NAF-1, ATCC # 97343, is constructed by 
PCR using two primers as described in the above examples. The 5' primer contains 
a convenient restriction site followed by a portion of NAF-1 coding sequence 
starting from the initiation codon; the 3' sequence contains complementary 
sequences to a convenient restriction site, translation stop codon, HA tag and the 
last several nucleotides of the NAF-1 coding sequence (not including the stop 
codon). Therefore, the PCR product contains convenient 5' and 3' restriction 
sites, NAF-1 coding sequence followed by HA tag fused in frame, and a translation 
termination stop codon next to the HA tag. The PCR amplified DNA fragment and 
the vector, pcDNAI/Amp, are digested and ligated. The ligation mixture is 
transformed into E. coli strain SURE (available from Stratagene Cloning Systems, 
11099 North Torrey Pines Road, La Jolla, CA 92037), the transformed culture is 
plated on ampicillin media plates and resistant colonies are selected. Plasmid DNA 
is isolated from transformants and examined by restriction analysis for the presence 
of the correct fragment. For expression of the recombinant NAF-1, COS cells are 
transfected with the expression vector by the DEAE-dextran method (J. Sambrook, 
E. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring 
Harbor Laboratory Press, (1989)). The expression of the NAF-1 HA protein is detected by 
radiolabelling and immunoprecipitation (E. Harlow, D. Lane, Antibodies: A 
Laboratory Manual, Cold Spring Harbor Laboratory Press, (1988)). Cells are 
labelled for 8 hours with 35S-cysteine two days post transfection. Culture media is 
then collected and cells are lysed with detergent (RIPA buffer: 150 mM NaCl, 1% 
NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5) (Wilson, I. et 
al., Id. 37:767 (1984)). Both cell lysate and culture media are precipitated with an 
HA specific monoclonal antibody. Proteins precipitated are analyzed on 15% 
SDS-PAGE gels. 

Example 4 

Expression via Gene Therapy 

Fibroblasts are obtained from a subject by skin biopsy. The resulting tissue 
is placed in tissue-culture medium and separated into small pieces. Small chunks of 
the tissue are placed on a wet surface of a tissue culture flask; approximately ten 
pieces are placed in each flask. The flask is turned upside down, closed tight and 
left at room temperature overnight. After 24 hours at room temperature, the flask is 
inverted, the chunks of tissue remain fixed to the bottom of the flask, and fresh 
media (e.g., Ham's F12 media, with 10% FBS, penicillin and streptomycin) is 
added. This is then incubated at 37°C for approximately one week. At this time, 
fresh media is added and subsequently changed every several days. After an 
additional two weeks in culture, a monolayer of fibroblasts emerges. The monolayer 
is trypsinized and scaled into larger flasks. 

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long 
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and 
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector 
is fractionated on agarose gel and purified, using glass beads. 

The cDNA encoding a polypeptide of the present invention is amplified 
using PCR primers which correspond to the 5' and 3' end sequences respectively. 
The 5' primer contains an EcoRI site and the 3' primer further includes a HindIII 
site. Equal quantities of the Moloney murine sarcoma virus linear backbone and the 
amplified EcoRI and HindIII fragment are added together, in the presence of T4 
DNA ligase. The resulting mixture is maintained under conditions appropriate for 
ligation of the two fragments. The ligation mixture is used to transform bacteria 
HB101, which are then plated onto agar containing kanamycin for the purpose of 
confirming that the vector had the gene of interest properly inserted. 



The amphotropic PA317 or GP+am12 packaging cells are grown in tissue 
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 
10% calf serum (CS), penicillin and streptomycin. The MSV vector containing the 
gene is then added to the media and the packaging cells are transduced with the 
vector. The packaging cells now produce infectious viral particles containing the 
gene (the packaging cells are now referred to as producer cells). 

Fresh media is added to the transduced producer cells, and subsequently, 
the media is harvested from a 10 cm plate of confluent producer cells. The spent 
media, containing the infectious viral particles, is filtered through a millipore filter to 
remove detached producer cells and this media is then used to infect fibroblast cells. 
Media is removed from a sub-confluent plate of fibroblasts and quickly replaced 
with the media from the producer cells. This media is removed and replaced with 
fresh media. If the titer of virus is high, then virtually all fibroblasts will be infected 
and no selection is required. If the titer is very low, then it is necessary to use a 
retroviral vector that has a selectable marker, such as neo or his. 

The engineered fibroblasts are then injected into the host, either alone or 
after having been grown to confluence on cytodex 3 microcarrier beads. The 
fibroblasts now produce the protein product. 

Numerous modifications and variations of the present invention are possible 
in light of the above teachings and, therefore, within the scope of the appended 
claims, the invention may be practiced otherwise than as particularly described. 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: HASTINGS, GREGG 
PATRICK J. DILLON 

(ii) TITLE OF INVENTION: HUMAN NEURONAL ATTACHMENT FACTOR-1 

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: HUMAN GENOME SCIENCES, INC. 
(B) STREET: 9410 KEY WEST AVENUE 
(C) CITY: ROCKVILLE 
(D) STATE: MD 
(E) COUNTRY: USA 
(F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC Compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: US 
(B) FILING DATE: 11-FEB-1997 
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: BROOKES, ANDERS A. 
(B) REGISTRATION NUMBER: 36,373 
(C) REFERENCE/DOCKET NUMBER: PF226 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (301) 309-8504 

(B) TELEFAX: (301) 309-85^2 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1105 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 19..1011 

(ix) FEATURE: 
(A) NAME/KEY: mat_peptide 
(B) LOCATION: 19..963 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 



ATC GCC AGG GTG ACA CTG GTG CGG CTG CGA CAG AGC CCC AGG GCC TTC 771 
Ile Ala Arg Val Thr Leu Val Arg Leu Arg Gln Ser Pro Arg Ala Phe 
240 245 250 

ATC CCT CCC GCC CCA GTC CTG CCC AGC AGG GAC AAT GAG ATT GTA GAC 819 
Ile Pro Pro Ala Pro Val Leu Pro Ser Arg Asp Asn Glu Ile Val Asp 
255 260 265 

AGC GCC TCA GTT CCA GAA ACG CCG CTG GAC TGC GAG GTC TCC CTG TGG 867 
Ser Ala Ser Val Pro Glu Thr Pro Leu Asp Cys Glu Val Ser Leu Trp 
270 275 280 

TCG TCC TGG GGA CTG TGC GGA GCC CAC TGT GGG AGG CTC GGG ACC AAG 915 
Ser Ser Trp Gly Leu Cys Gly Gly His Cys Gly Arg Leu Gly Thr Lys 
285 290 295 

AGC AGG ACT CGC TAC GTC CGG GTC CAG CCC GCC AAC AAC GGG AGC CCC 963 
Ser Arg Thr Arg Tyr Val Arg Val Gln Pro Ala Asn Asn Gly Ser Pro 
300 305 310 315 

TGC CCC GAG CTC GAA GAA GAG GCT GAG TGC GTC CCT GAT AAC TGC GTC 1011 
Cys Pro Glu Leu Glu Glu Glu Ala Glu Cys Val Pro Asp Asn Cys Val 
320 325 330 

TAAGACCAGA GCCCCGCAGC CCCTGGGGCC CCCCGGAGCC ATGGGGTGTC GGGGGCTCCT 1071 
GTGCAGGCTC ATGCTGCAGG CGGCCGAGGG CACA 1105 

(2) INFORMATION FOR SEQ ID NO : 2 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 331 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 

Met Glu Asn Pro Ser Pro Ala Ala Ala Leu Gly Lys Ala Leu Cys Ala 
1 5 10 15 

Leu Leu Leu Ala Thr Leu Gly Ala Ala Gly Gln Pro Leu Gly Gly Glu 
20 25 30 

Ser Ile Cys Ser Ala Arg Ala Leu Ala Lys Tyr Ser Ile Thr Phe Thr 
35 40 45 

Gly Lys Trp Ser Gln Thr Ala Phe Pro Lys Gln Tyr Pro Leu Phe Arg 
50 55 60 

Pro Pro Ala Gln Trp Ser Ser Leu Leu Gly Ala Ala His Ser Ser Asp 
65 70 75 80 

Tyr Ser Met Trp Arg Lys Asn Gln Tyr Val Ser Asn Gly Leu Arg Asp 
85 90 95 

Phe Ala Glu Arg Gly Glu Ala Trp Ala Leu Met Lys Glu Ile Glu Ala 
100 105 110 

Ala Gly Glu Ala Leu Gln Ser Val His Ala Val Phe Ser Ala Pro Ala 
115 120 125 

Val Pro Ser Gly Thr Gly Gln Thr Ser Ala Glu Leu Glu Val Gln Arg 
130 135 140 

Arg His Ser Leu Val Ser Phe Val Val Arg Ile Val Pro Ser Pro Asp 
145 150 155 160 

Trp Phe Val Gly Val Asp Ser Leu Asp Leu Cys Asp Gly Asp Arg Trp 
165 170 175 

Arg Glu Gln Ala Ala Leu Asp Leu Tyr Pro Tyr Asp Ala Gly Thr Asp 
180 185 190 

Ser Gly Phe Thr Phe Ser Ser Pro Asn Phe Ala Thr Ile Pro Gln Asp 
195 200 205 

Thr Val Thr Glu Ile Thr Ser Ser Ser Pro Ser His Pro Ala Asn Ser 
210 215 220 

Phe Tyr Tyr Pro Arg Leu Lys Ala Leu Pro Pro Ile Ala Arg Val Thr 
225 230 235 240 

Leu Val Arg Leu Arg Gln Ser Pro Arg Ala Phe Ile Pro Pro Ala Pro 
245 250 255 

Val Leu Pro Ser Arg Asp Asn Glu Ile Val Asp Ser Ala Ser Val Pro 
260 265 270 

Glu Thr Pro Leu Asp Cys Glu Val Ser Leu Trp Ser Ser Trp Gly Leu 
275 280 285 

Cys Gly Gly His Cys Gly Arg Leu Gly Thr Lys Ser Arg Thr Arg Tyr 
290 295 300 

Val Arg Val Gln Pro Ala Asn Asn Gly Ser Pro Cys Pro Glu Leu Glu 
305 310 315 320 

Glu Glu Ala Glu Cys Val Pro Asp Asn Cys Val 
325 330 

(2) INFORMATION FOR SEQ ID NO : 3 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 
GCCATACGGG ATCCCCAGCC TCTTGGGGGA GAGTCC 
(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 
GGCATACGTC TAGATTAGAC GCAGTTATCA GGGAC 
(2) INFORMATION FOR SEQ ID NO : 5 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 
GCCATACGGG ATCCGCCATC ATGGAAAACC CCAGCCCGGC C 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 
GGCATACGTC TAGATTAGAC GCAGTTATCA GGGAC 
(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 392 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

( D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Pro Thr Gly Thr Gly Cys Val Ile Leu Lys Ala Ser Ile Val Gln Lys 
1 5 10 15 

Arg Ile Ile Tyr Phe Gln Asp Glu Gly Ser Leu Thr Lys Lys Leu Cys 
20 25 30 

Glu Gln Asp Pro Thr Leu Asp Gly Val Thr Asp Arg Pro Ile Leu Asp 
35 40 45 

Cys Cys Ala Cys Gly Thr Ala Lys Tyr Arg Leu Thr Phe Tyr Gly Asn 
50 55 60 

Trp Ser Glu Lys Thr His Pro Lys Asp Tyr Pro Arg Arg Ala Asn His 
65 70 75 80 

Trp Ser Ala Ile Ile Gly Gly Ser His Ser Lys Asn Tyr Val Leu Trp 
85 90 95 

Glu Tyr Gly Gly Tyr Ala Ser Glu Gly Val Lys Gln Val Ala Glu Leu 
100 105 110 

Gly Ser Pro Val Lys Met Glu Glu Glu Ile Arg Gln Gln Ser Asp Glu 
115 120 125 

Val Leu Thr Val Ile Lys Ala Lys Ala Gln Trp Pro Ser Trp Gln Pro 
130 135 140 

Val Asn Val Arg Ala Ala Pro Ser Ala Glu Phe Ser Val Asp Arg Thr 
145 150 155 160 

Arg His Leu Met Ser Phe Leu Thr Met Met Gly Pro Ser Pro Asp Trp 
165 170 175 

Asn Val Gly Leu Ser Ala Glu Asp Leu Cys Thr Lys Glu Cys Gly Trp 
180 185 190 

Val Gln Lys Val Val Gln Asp Leu Ile Pro Trp Asp Ala Gly Thr Asp 
195 200 205 

Ser Gly Val Thr Tyr Glu Ser Pro Asn Lys Pro Thr Ile Pro Gln Glu 
210 215 220 

Lys Ile Arg Pro Leu Thr Ser Leu Asp His Pro Gln Ser Pro Phe Tyr 
225 230 235 240 

Asp Pro Glu Gly Gly Ser Ile Thr Gln Val Ala Arg Val Val Ile Glu 
245 250 255 

Arg Ile Ala Arg Lys Gly Glu Gln Cys Asn Ile Val Pro Asp Asn Val 
260 265 270 

Asp Asp Ile Val Ala Asp Leu Ala Pro Glu Glu Lys Asp Glu Asp Asp 
275 280 285 

Thr Pro Glu Thr Cys Ile Tyr Ser Asn Trp Ser Pro Trp Ser Ala Cys 
290 295 300 

Ser Ser Ser Thr Cys Glu Lys Gly Lys Arg Met Arg Gln Arg Met Leu 
305 310 315 320 

Lys Ala Gln Leu Asp Leu Ser Val Pro Cys Pro Asp Thr Gln Asp Phe 
325 330 335 

Gln Pro Cys Met Gly Pro Gly Cys Ser Asp Glu Asp Gly Ser Thr Cys 
340 345 350 

Thr Met Ser Glu Trp Ile Thr Trp Ser Pro Cys Ser Val Ser Cys Gly 
355 360 365 

Met Gly Met Arg Ser Arg Glu Arg Tyr Val Lys Gln Phe Pro Glu Asp 
370 375 380 

Gly Ser Val Cys Met Leu Pro Thr 
385 390 




(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 52 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Cys Ile Tyr Ser Asn Trp Ser Pro Trp Ser Ala Cys Ser Ser Ser Thr 
1 5 10 15 

Cys Glu Lys Gly Lys Arg Met Arg Gln Arg Met Leu Lys Ala Gln Leu 
20 25 30 

Asp Leu Ser Val Pro Cys Pro Asp Thr Gln Asp Phe Gln Pro Cys Met 
35 40 45 

Gly Pro Gly Cys 
50 



(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 53 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Cys Thr Met Ser Glu Trp Ile Thr Trp Ser Pro Cys Ser Val Ser Cys 
1 5 10 15 

Gly Met Gly Met Arg Ser Arg Glu Arg Tyr Val Lys Gln Phe Pro Glu 
20 25 30 

Asp Gly Ser Val Cys Met Leu Pro Thr Glu Glu Thr Glu Lys Cys Thr 
35 40 45 

Val Asn Glu Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 52 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 

Cys Leu Val Thr Glu Trp Gly Glu Trp Asp Asp Cys Ser Ala Thr Cys 
1 5 10 15 

Gly Met Gly Met Lys Lys Arg His Arg Met Val Lys Met Ser Pro Ala 
20 25 30 

Asp Gly Ser Met Cys Lys Ala Glu Thr Ser Gln Ala Glu Lys Cys Met 
35 40 45 

Met Pro Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO:11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 51 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: 

Cys Leu Leu Ser Pro Trp Ser Glu Trp Ser Asp Cys Ser Val Thr Cys 
1 5 10 15 

Gly Lys Gly Met Arg Thr Arg Gln Arg Met Leu Lys Ser Leu Ala Glu 
20 25 30 

Leu Gly Asp Cys Asn Glu Asp Leu Glu Gln Ala Glu Lys Cys Met Leu 
35 40 45 

Pro Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: 

Cys Glu Leu Ser Glu Trp Ser Gln Trp Ser Glu Cys Asn Lys Ser Cys 
1 5 10 15 

Gly Lys Gly His Met Ile Arg Thr Arg Thr Ile Gln Met Glu Pro Gln 
20 25 30 

Phe Gly Gly Ala Pro Cys Pro Glu Thr Val Gln Arg Lys Lys Cys Arg 
35 40 45 

Ala Arg Lys Cys 
50 

(2) INFORMATION FOR SEQ ID NO:13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 53 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

Cys Arg Met Arg Pro Trp Thr Ala Trp Ser Glu Cys Thr Lys Leu Cys 
1 5 10 15 

Gly Gly Gly Ile Gln Glu Arg Tyr Met Thr Val Lys Lys Arg Phe Lys 
20 25 30 

Ser Ser Gln Phe Thr Ser Cys Lys Asp Lys Lys Glu Ile Arg Ala Cys 
35 40 45 

Asn Val His Pro Cys 
50 

(2) INFORMATION FOR SEQ ID NO:14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 50 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

Cys Leu Val Ser Glu Trp Ser Glu Trp Ser Asp Cys Ser Thr Cys Gly 
1 5 10 15 

Lys Gly Met Arg Ser Arg Thr Arg Met Val Lys Met Ser Pro Ala Asp 
20 25 30 

Gly Ser Pro Cys Pro Asp Thr Glu Glu Ala Glu Lys Cys Met Val Pro 
35 40 45 

Glu Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 506 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

GAATTCGGCA NAGGNNAAAC CCCAGCCCGG CTGCCGCCCT GGGCAAGGCC TNCTGCGCTC 60 
TCCTCCTGGC CACTCTCGGC GCCGGCACCA GCCTCTTGGG GGAGAGTCCA TCTNTTCCGC 120 
CAGAGCCCCG GCCAAATACA GCATCACCTT CACGGGCAAG TGGAGCCAGA CGGCCTTCCC 180 
CAAGCAGTAC CCCCTGTTCC GCCCCCCTGC GCATGGTNTT CGCTGCTGGG GGCCGCGCAT 240 
AGCTCCGACT ACAGCATGTG GAGGAAGAAC CAGTACGTCA TAAACGGGCT GCGCGACTTT 300 
NCGGAGCGGC GAGGCCTNGG NCGTTGATGA AGGAGATCCG GGNGGCGGGG GAGGCGTNCA 360 
ANAGGTGNCA AGAGTTNTTT TCGGGGCCCG GTTCCCCAAN GGNAACNGGN AAACGTTGGG 420 
GGNTTTNNAG TTTNAAGAAG NAATTNTTGG TTTTTTTTTG GGTGGGATTT TNCCAACCCN 480 
ATTGTTTNTG GGNTGGAAAA TTNGAC 506 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 316 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GGCANNGCCA GTACGTCATA ACGGGCTGCG CGACTTTGCG GANGCGGCGA GGCCTGGGCG 60 
CTGATGAAGG AGATCAAGGC GGCGGGGGAG GCGCTGCAGA GGTGCACGAG GTGTTTTCGG 120 
CGCCCGGTNN CCCAGCGNCA CCNGGCAGAC GTCGGCGAAC TGGNAGGTGC AGCGCAGGCA 180 
CTCGCTGGTC TCGTTTGTGG TGCGCATCGT GCCCAGCCCC GACTGGTTCG TGGGCGTGGA 240 
CAGCCTGGGA CCTGTGANAA CGGGGACCTT TNGCGNGNAA CAGGCGNCGT TGGACCTGTA 300 
NCCCTACGAC GNCGGG 316 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 316 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

GGCANNGCCA GTACGTCATA ACGGGCTGCG CGACTTTGCG GANGCGGCGA GGCCTGGGCG 60 
CTGATGAAGG AGATCAAGGC GGCGGGGGAG GCGCTGCAGA GGTGCACGAG GTGTTTTCGG 120 
CGCCCGGTNN CCCAGCGNCA CCNGGCAGAC GTCGGCGAAC TGGNAGGTGC AGCGCAGGCA 180 
CTCGCTGGTC TCGTTTGTGG TGCGCATCGT GCCCAGCCCC GACTGGTTCG TGGGCGTGGA 240 
CAGCCTGGGA CCTGTGANAA CGGGGACCTT TNGCGNGNAA CAGGCGNCGT TGGACCTGTA 300 
NCCCTACGAC GNCGGG 316 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 53 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Cys Glu Val Ser Leu Trp Ser Ser Trp Gly Leu Cys Gly Gly His Cys 
1 5 10 15 

Gly Arg Leu Gly Thr Lys Ser Arg Thr Arg Tyr Val Arg Val Gln Pro 
20 25 30 

Ala Asn Asn Gly Ser Pro Cys Pro Glu Leu Glu Glu Glu Ala Glu Cys 
35 40 45 

Val Pro Asp Asn Cys 
50 


